

AMERICAN 
JOURNAL OF MATHEMATICS 


FOUNDED BY THE JOHNS HOPKINS UNIVERSITY 


EDITED BY 


ABRAHAM COHEN F. D. MURNAGHAN 
THE JOHNS HOPKINS UNIVERSITY THE JOHNS HOPKINS UNIVERSITY 


T. H. HILDEBRANDT J. F. RITT 
UNIVERSITY OF MICHIGAN COLUMBIA UNIVERSITY 


R. L. WILDER 
UNIVERSITY OF MICHIGAN 


WITH THE COOPERATION OF 


E. T. BELL G. C. EVANS OYSTEIN ORE

H. B. CURRY R. D. JAMES H. P. ROBERTSON
E. J. MCSHANE GABRIEL SZEGO M. H. STONE
HANS RADEMACHER AUREL WINTNER T. Y. THOMAS 
OSCAR ZARISKI LEO ZIPPIN G. T. WHYBURN 


PUBLISHED UNDER THE JOINT AUSPICES OF 


THE JOHNS HOPKINS UNIVERSITY 


AND 
THE AMERICAN MATHEMATICAL SOCIETY 


Volume LXI, Number 2 
APRIL, 1939 


THE JOHNS HOPKINS PRESS 
BALTIMORE, MARYLAND 
U. S. A. 




CONTENTS

Some results in the arithmetic theory of algebraic varieties. By Oscar Zariski.

On simultaneous expansions of analytic [...] in composite series. By A. C. [...].

Some theorems on quasi-analyticity for functions of several variables. By S. Bochner and A. E. Taylor.

Problems of the calculus of variations with prescribed transversality conditions. By D. Mancill.

Normal matrices over an arbitrary field of characteristic zero. By John Williamson.

The Waring problem with [...]. By Mary Haberzetle.

On the units of indefinite quaternion algebras. By Ralph Hull.

Non-abelian compact connected transformation groups of three-space. By Deane Montgomery and Leo Zippin.

A remark on normal extensions. By O. F. G. Schilling.

Borel summability and Lambert series. By Tomlinson Fort.

Singular maps of differentiable manifolds of n dimensions into n dimensional Euclidean space. By T. Y. Thomas.

The distribution of the maxima of a random curve. By S. O. Rice.

The clamped square sheet. By D. G. Bourgin.

Tubes and spheres in n-spaces, and a class of statistical problems. By Harold Hotelling.

On the volume of tubes. By Hermann Weyl.

Note on power series with big gaps. By M. Kac.

Asymptotic distributions and statistical independence. By Philip Hartman, E. R. van Kampen and Aurel Wintner.

The quaternion congruence $\bar{t}at \equiv b \pmod{g}$. By R. E. O'Connor and G. Pall.

Quasi-groups which satisfy generalized [...]. By [...] C. Murdoch.

On differential operators in Hilbert spaces. By Kurt Friedrichs.


THE AMERICAN JOURNAL OF MATHEMATICS will appear four times yearly. 

The subscription price of the JOURNAL for the current volume is $7.50 (foreign
postage 50 cents); single numbers $2.00.

A few complete sets of the JOURNAL remain on sale.

Papers intended for publication in the JOURNAL may be sent to any of the Editors.

Editorial communications may be sent to Dr. A. COHEN at The Johns Hopkins
University.

Subscriptions to the JOURNAL and all business communications should be sent to
THE JOHNS HOPKINS PRESS, BALTIMORE, MARYLAND, U. S. A.


Entered as second-class matter at the Baltimore, Maryland, Postoffice, acceptance for mailing at special 
rate of postage provided for in Section 1103, Act of October 8, 1917, Authorized on July 3, 1918. 


PRINTED IN THE UNITED STATES OF AMERICA 
BY J. H. FURST COMPANY, BALTIMORE, MARYLAND 




SOME RESULTS IN THE ARITHMETIC THEORY OF ALGEBRAIC 
VARIETIES.* 


By Oscar ZARISKI. 


Introduction. In Part I we treat systematically some basic questions of
the theory of singularities of an algebraic variety. The main results bear upon
various characterizations of simple points of a variety. We begin with an
ideal theoretic definition of a simple point (section 2). This definition is
given in terms of relative unramified prime zero-dimensional ideals. We next
characterize a simple point by the existence of a local uniformization satisfying
certain conditions (sections 3 and 4). At the same time we obtain a
characterization of a simple point by an intrinsic (non-relative) property of
the corresponding prime zero-dimensional ideal (Theorem 3.2). Finally we
connect the singular points of a variety with the properties of the different
(sections 5-8) and we exhibit an ideal whose variety is the manifold of
singular points of the given variety (sections 8-9).

In Part II we derive some properties of the conductor of a finite integral
domain $\mathfrak{o}$ with respect to its integral closure $\mathfrak{o}^*$. Those properties concern the
usual question of the decomposition in $\mathfrak{o}^*$ of a prime ideal in $\mathfrak{o}$. The result is
complete in the case that $\mathfrak{o}$ is generated by the independent variables and a
primitive element (sections 11-12). Also here it was possible to obtain an
intrinsic result valid in $\mathfrak{o}^*$ (section 13).

The contents of Parts I and II are, in the main, generalizations of well- 
known theorems in the arithmetic theory of algebraic functions of one variable. 

The contributions of Parts III and IV are new also in the case of functions
of one variable. Here we introduce the concept of a normal variety, both
in the affine and in the projective space, and we are led to a geometric
interpretation of the operation of integral closure. The importance of normal
varieties is due to their following two properties: 1) the singular manifold of
a normal $V_r$ is of dimension $\le r-2$ (in particular a normal curve ($V_1$) is
free from singularities); 2) the system of hyperplane sections of a normal $V_r$
is complete. There is a definite class of normal varieties associated with and
birationally equivalent to a given variety $V_r$. This class is obtained by a
process of integral closure carried out in a suitable fashion for varieties in
projective spaces. It turns out that the varieties of this class are those on

* Received October 7, 1938. 




which the hyperplanes cut out the complete systems $|hC|$, sufficiently high
multiples of the system $|C|$ of hyperplane sections of the given $V_r$. These
results seem to point to a fruitful arithmetic approach to questions of the
birational theory of varieties.

The special birational transformations effected by the operation of integral
closure, and the properties of normal surfaces, play an essential role
in our arithmetic proof for the reduction of singularities of an algebraic surface.
This proof will be published in the July issue of the Annals of Mathematics.

Although the underlying field of coefficients is supposed throughout to be 
of characteristic zero, the proofs remain valid for any characteristic, provided 
only separable extensions are being considered. The separability is no restric- 
tion from the birational point of view, or even from the projective point of 
view. 

Many of the results of Parts I, II, and III have been announced by the 
author without proof in a Note of the Proceedings of the National Academy 
of Sciences [10]. 


I. Simple and multiple points of an algebraic variety. 


1. We consider an algebraic irreducible r-dimensional variety $V_r$, in an
affine space $S_n(x_1,\dots,x_n)$, over an algebraically closed ground field K of
characteristic zero. Let $\xi_1,\dots,\xi_n$ be the coördinates of the general point of
$V_r$ and let $\Sigma = K(\xi_1,\dots,\xi_n)$ be the field of rational functions on $V_r$, of
degree of transcendency r over K. We denote by $\mathfrak{o}$ the ring $K[\xi_1,\dots,\xi_n]$,
whose elements are polynomials in the $\xi$'s. The defining ideal of $V_r$ in the
polynomial ring $K[x]$ ($= K[x_1,\dots,x_n]$) is the prime r-dimensional ideal
$\mathfrak{p}'_r$, consisting of all polynomials $f(x_1,\dots,x_n)$ such that $f(\xi_1,\dots,\xi_n) = 0$.
The ring $\mathfrak{o}$ is the ring of residual classes $K[x]/\mathfrak{p}'_r$ and $\Sigma$ is the quotient field of $\mathfrak{o}$.

The polynomials in $K[x]$ which vanish at a given point $P(a_1,\dots,a_n)$
of $S_n$ form a prime 0-dimensional ideal $\mathfrak{p}'_0 = (x_1-a_1,\dots,x_n-a_n)$. The
point P is on $V_r$ if and only if $\mathfrak{p}'_r \subset \mathfrak{p}'_0$. The homomorphism between $K[x]$
and $K[\xi]$ sets up a one-to-one correspondence between the prime 0-dimensional
ideals $\mathfrak{p}'_0$ in $K[x]$ which contain $\mathfrak{p}'_r$ and the prime 0-dimensional ideals $\mathfrak{p}_0$
in $\mathfrak{o}$ ($= K[\xi]$); here $\mathfrak{p}_0 = \mathfrak{p}'_0/\mathfrak{p}'_r$. Thus there is a one-to-one correspondence
between the points of $V_r$ (in the affine $S_n$) and the prime 0-dimensional ideals
in $\mathfrak{o}$: if $P(a_1,\dots,a_n)$ is a point on $V_r$, then the ideal
$\mathfrak{p}_0 = (\xi_1-a_1,\dots,\xi_n-a_n)$ in $\mathfrak{o}$ is prime and 0-dimensional, and conversely.
If P is not on $V_r$, then $(\xi_1-a_1,\dots,\xi_n-a_n)$ is the unit ideal.
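
As a small illustration of this correspondence, one may take a plane parabola; the curve and the two points below are our own choice of example and do not come from the text. The sketch checks one point on the curve, whose ideal is prime and 0-dimensional, and one point off the curve, whose ideal turns out to be the unit ideal.

```latex
% Illustrative example (an assumption, not from the source):
% n = 2, r = 1, and V_1 is the parabola x_2 = x_1^2.
\[
  \mathfrak{o} = K[\xi_1,\xi_2], \qquad \xi_2 = \xi_1^{2}.
\]
% The point P(1,1) lies on V_1. Its ideal is prime and 0-dimensional:
\[
  \mathfrak{p}_0 = (\xi_1 - 1,\ \xi_2 - 1) = (\xi_1 - 1),
  \qquad \xi_2 - 1 = (\xi_1 - 1)(\xi_1 + 1),
  \qquad \mathfrak{o}/\mathfrak{p}_0 \cong K .
\]
% The point (0,1) is not on V_1. The corresponding ideal contains 1:
\[
  1 = \xi_1\cdot\xi_1 - (\xi_2 - 1) \in (\xi_1,\ \xi_2 - 1),
\]
% so (\xi_1, \xi_2 - 1) is the unit ideal of \mathfrak{o}.
```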

If $\mathfrak{p}_0$ is any prime 0-dimensional ideal in $\mathfrak{o}$, then the ring $\mathfrak{o}/\mathfrak{p}_0$ contains
a field $K^* \cong K$ (if $c_1, c_2$ are in K and $c_1 \neq c_2$, then $c_1 - c_2 \not\equiv 0\ (\mathfrak{p}_0)$, since $\mathfrak{p}_0$ is
not the unit ideal), and moreover every element of $\mathfrak{o}/\mathfrak{p}_0$ is algebraic over $K^*$.
Since $K^*$ is assumed algebraically closed, it follows that $\mathfrak{o}/\mathfrak{p}_0$ coincides with
the field $K^*$.¹ Thus every element $\omega$ in $\mathfrak{o}$ satisfies a congruence of the form
$\omega \equiv c\ (\mathfrak{p}_0)$. We shall say that $\omega$ has the value c at P. In particular, $\xi_i \equiv a_i\ (\mathfrak{p}_0)$
and to $\mathfrak{p}_0$ there corresponds the point $P(a_1,\dots,a_n)$ on $V_r$.


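2. DEFINITION. A point $P(a_1,\dots,a_n)$ on $V_r$ is said to be a simple
point of $V_r$, if there exist, in $\mathfrak{o}$, r elements $\eta_1,\dots,\eta_r$ such that the ideal
$\mathfrak{A} = \mathfrak{o}\cdot(\eta_1,\dots,\eta_r)$ is divisible by (is contained in) $\mathfrak{p}_0$
($= \mathfrak{o}\cdot(\xi_1-a_1,\dots,\xi_n-a_n)$) but is not divisible by any proper primary ideal belonging to $\mathfrak{p}_0$.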


The condition in the above definition is equivalent to the following:
$\mathfrak{p}_0$ must be an isolated component of the ideal $\mathfrak{A}$. In fact, $\mathfrak{A} \equiv 0\ (\mathfrak{p}_0)$ and if
$\mathfrak{p}_0$ were not an isolated component of $\mathfrak{A}$, then $\mathfrak{A}$ must be divisible by some prime
ideal $\mathfrak{p}$ of dimension > 0 and contained in $\mathfrak{p}_0$ (since, by hypothesis, no primary
ideal belonging to $\mathfrak{p}_0$ can be a component of $\mathfrak{A}$). This, however, is impossible,
since it can be easily shown that any such prime ideal $\mathfrak{p}$ is also a multiple of
some primary ideal belonging to $\mathfrak{p}_0$.²

We proceed to derive properties of a simple point which we shall have 
occasion to use in the sequel and which will also bear upon the geometric 
content of our definition. 

Let $P(a_1,\dots,a_n)$ be a simple point of $V_r$ and let, according to our
definition of a simple point, $\eta_1,\dots,\eta_r$ be r elements in $\mathfrak{o}$ such that the ideal
$\mathfrak{p}_0 = (\xi_1-a_1,\dots,\xi_n-a_n)$ is an isolated component of the ideal
$\mathfrak{A} = (\eta_1,\dots,\eta_r)$. We may assume, without loss of generality, that P is at
the origin of coordinates, whence $\mathfrak{p}_0 = (\xi_1,\dots,\xi_n)$.

The elements $\eta_i$ are polynomials in the $\xi$'s. In each of these polynomials
the constant term is zero, since $\eta_i \equiv 0\ (\mathfrak{p}_0)$. Let

(1) $\eta_i = c_{i1}\xi_1 + \cdots + c_{in}\xi_n$ + terms of degree $\ge 2$.

We assert that the r linear forms $\sum_{j=1}^{n} c_{ij}\xi_j$ are linearly independent modulo $\mathfrak{p}_0^2$.
To see this, we observe that if this were not the case, then also the r elements
$\eta_i$ would be linearly dependent modulo $\mathfrak{p}_0^2$, since $\eta_i - \sum_{j=1}^{n} c_{ij}\xi_j \equiv 0\ (\mathfrak{p}_0^2)$. Let,
say,

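¹ That $\mathfrak{o}/\mathfrak{p}_0$ is a field follows also from the fact that $\mathfrak{p}_0$ has no divisors other than
$\mathfrak{p}_0$ and the unit ideal.

² In the homomorphism $\mathfrak{o} \sim \bar{\mathfrak{o}} = \mathfrak{o}/\mathfrak{p}$ the ideal $\mathfrak{p}_0$, as a divisor of $\mathfrak{p}$, is mapped
upon a 0-dimensional ideal $\bar{\mathfrak{p}}_0$ in $\bar{\mathfrak{o}}$, and the primary ideals belonging to $\bar{\mathfrak{p}}_0$ correspond
to primary ideals in $\mathfrak{o}$ belonging to $\mathfrak{p}_0$ and containing $\mathfrak{p}$.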




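$d_1\eta_1 + \cdots + d_r\eta_r \equiv 0\ (\mathfrak{p}_0^2)$,

where $d_1,\dots,d_r$ are in K and are not all zero. Let $d_r \neq 0$. It follows from
the above congruence that

(2) $\eta_r \equiv 0\ (\eta_1,\dots,\eta_{r-1},\ \mathfrak{p}_0^2)$.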


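We consider the ideal $\mathfrak{B} = (\eta_1,\dots,\eta_{r-1})$. We observe that $\mathfrak{B}$ is not the unit
ideal, since $\mathfrak{B} \equiv 0\ (\mathfrak{p}_0)$. Hence, by a well known theorem³ the minimal
(non-imbedded) prime ideals of $\mathfrak{B}$ are all of dimension not less than 1. Since
$\mathfrak{B} \subset \mathfrak{p}_0$, $\mathfrak{p}_0$ must divide at least one of the minimal prime ideals of $\mathfrak{B}$, say $\mathfrak{p}'$.
It follows, as has been pointed out in footnote 2, that there exist primary
ideals belonging to $\mathfrak{p}_0$ (and distinct from $\mathfrak{p}_0$) which divide $\mathfrak{p}'$. There will also
then exist a maximal primary ideal $\mathfrak{q}_0$ with this same property, i.e. a primary
ideal $\mathfrak{q}_0$ belonging to $\mathfrak{p}_0$ and such that there are no ideals between $\mathfrak{p}_0$ and $\mathfrak{q}_0$.
Now it is well known that each maximal primary ideal of $\mathfrak{p}_0$ is a divisor of
$\mathfrak{p}_0^2$.⁴ Hence $\mathfrak{p}_0^2 \equiv 0\ (\mathfrak{q}_0)$, and since also $\mathfrak{B} \equiv 0\ (\mathfrak{p}') \equiv 0\ (\mathfrak{q}_0)$, it follows, by
(2), $\eta_r \equiv 0\ (\mathfrak{q}_0)$. Hence $\mathfrak{A} \equiv 0\ (\mathfrak{q}_0)$, in contradiction with the hypothesis that
$\mathfrak{p}_0$ is an isolated component of $\mathfrak{A}$.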


In view of the linear independence of the forms $\sum_j c_{ij}\xi_j$ mod $\mathfrak{p}_0^2$, it follows,
a fortiori, that the matrix $(c_{ij})$ is of rank r. Hence by means of a non-singular
linear homogeneous transformation of the coördinates $\xi_i$ of the general
point of $V_r$, it can be arranged that the elements $\eta_i$ have the following form:

(3) $\eta_i = \xi_i + f_i(\xi_1,\dots,\xi_n)$, $(i = 1,2,\dots,r)$,


where $f_i$ is a polynomial whose terms are all of degree $\ge 2$. Now let $\omega$ be
any element in $\mathfrak{p}_0$. Since $\mathfrak{p}_0$ is an isolated component of $\mathfrak{A}$, it follows that
there exists an element $\alpha$ in $\mathfrak{o}$, such that $\alpha \not\equiv 0\ (\mathfrak{p}_0)$ and $\alpha\omega \equiv 0\ (\mathfrak{A})$. Since
$\alpha \equiv c\ (\mathfrak{p}_0)$, c in K, $c \neq 0$, this implies that $\omega \equiv 0\ (\mathfrak{A}, \mathfrak{p}_0^2)$, i.e.

$\omega = A_1\eta_1 + \cdots + A_r\eta_r + B$, $A_i$ in $\mathfrak{o}$, $B \equiv 0\ (\mathfrak{p}_0^2)$.

Replacing the $\eta_i$ by their expressions in (3) and observing that B, as an element
of $\mathfrak{p}_0^2$, can be expressed as a polynomial in the $\xi$'s in which all the terms
are of degree $\ge 2$, we see that our arbitrary element $\omega$ in $\mathfrak{p}_0$ can be put in
the form:

(4) $\omega = c_1\xi_1 + \cdots + c_r\xi_r + g(\xi_1,\dots,\xi_n)$,

where g contains only terms of degree $\ge 2$. Or, in other words: there exist
constants $c_1,\dots,c_r$ such that

(5) $\omega \equiv c_1\xi_1 + \cdots + c_r\xi_r\ (\mathfrak{p}_0^2)$.

³ See [2], p. 43 and the references on p. 45 to Macaulay and van der Waerden.

⁴ See, for instance, [1], theorem 2, p. 529. Although the assertion is there proved
only for rings in which the "weak" Doppelkettensatz holds true, the proof of this
theorem, as well as of Theorem 3 and Corollary on p. 529, loc. cit., carry over to
zero-dimensional ideals in arbitrary finite integral domains. On the other hand, our
assertion is practically trivial if we observe that it obviously holds in the polynomial ring
$K[x_1,\dots,x_n]$ for 0-dimensional ideals and if we consider the homomorphism between
$K[x]$ and $K[\xi]$.

THEOREM 1. Let $\omega_1,\dots,\omega_r$ be elements of $\mathfrak{p}_0$; $\omega_i \equiv c_{i1}\xi_1 + \cdots + c_{ir}\xi_r\ (\mathfrak{p}_0^2)$.
A necessary and sufficient condition that $\mathfrak{p}_0$ be an isolated component of the
ideal $(\omega_1,\dots,\omega_r)$ is that the determinant $|c_{ij}|$ be different from zero.


Proof. Assume $|c_{ij}| \neq 0$. Given any element $\omega$ in $\mathfrak{p}_0$, it is then possible
to find constants $d_1,\dots,d_r$, such that $\omega \equiv d_1\omega_1 + \cdots + d_r\omega_r\ (\mathfrak{p}_0^2)$. Hence
$\mathfrak{p}_0 = (\omega_1,\dots,\omega_r,\ \mathfrak{p}_0^2)$, and this implies that $\mathfrak{p}_0$ is an isolated component of
the ideal $(\omega_1,\dots,\omega_r)$ (in view of the fact that $\mathfrak{p}_0^2$ is a multiple of every
maximal primary ideal belonging to $\mathfrak{p}_0$; see footnote 4).

Assume $|c_{ij}| = 0$. There exist then constants $d_1,\dots,d_r$, not all zero,
such that $d_1\omega_1 + \cdots + d_r\omega_r \equiv 0\ (\mathfrak{p}_0^2)$. If, for instance, $d_r \neq 0$, then it
follows that $\omega_r \equiv 0\ (\omega_1,\dots,\omega_{r-1},\ \mathfrak{p}_0^2)$. This congruence is analogous to the
congruence (2), encountered above, and leads therefore to a similar conclusion,
namely that the ideal $(\omega_1,\dots,\omega_r)$ is a multiple of some maximal primary ideal
$\mathfrak{q}_0$ belonging to $\mathfrak{p}_0$, q. e. d.
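
The determinant criterion of Theorem 1 can be tried out on a very simple case. The following sketch assumes, purely for illustration, that $V_2$ is the whole plane (so r = n = 2); the two pairs of elements are hypothetical examples chosen by us, one with non-vanishing and one with vanishing determinant.

```latex
% Assumed setting (not from the source): V_2 is the affine plane,
% \mathfrak{o} = K[\xi_1,\xi_2],  \mathfrak{p}_0 = (\xi_1,\xi_2).

% Case 1: determinant different from zero.
\[
  \omega_1 = \xi_1 + \xi_2^{2}, \quad \omega_2 = \xi_2 + \xi_1\xi_2,
  \qquad
  |c_{ij}| = \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1 \neq 0 .
\]
% Put \alpha = 1 + \xi_1, so \alpha \not\equiv 0 (\mathfrak{p}_0). Then
\[
  \alpha\,\xi_2 = \omega_2, \qquad
  \alpha\,\xi_1 = \alpha\,\omega_1 - \xi_2\,\omega_2 ,
\]
% hence \alpha\,\mathfrak{p}_0 \subset (\omega_1,\omega_2) \subset \mathfrak{p}_0,
% and \mathfrak{p}_0 is an isolated component of (\omega_1,\omega_2).

% Case 2: determinant equal to zero.
\[
  \omega_1 = \xi_1, \quad \omega_2 = \xi_1 + \xi_2^{2},
  \qquad
  |c_{ij}| = \begin{vmatrix} 1 & 0 \\ 1 & 0 \end{vmatrix} = 0 ,
  \qquad
  (\omega_1,\omega_2) = (\xi_1,\ \xi_2^{2}) .
\]
% The ideal (\xi_1, \xi_2^2) is itself a proper primary ideal belonging
% to \mathfrak{p}_0, so \mathfrak{p}_0 is not a component of it.
```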

In particular, if we take for $\omega_1,\dots,\omega_r$ linear forms $\bar\xi_1,\dots,\bar\xi_r$ in
$\xi_1,\dots,\xi_n$, then, for non-special values of the coefficients of these forms, the
determinant $|c_{ij}|$ will be different from zero. Moreover, by a well known
"normalization theorem,"⁵ for non-special values of these coefficients the
elements $\bar\xi_1,\dots,\bar\xi_r$ have the property that they are algebraically independent⁶
and that every element in $\mathfrak{o}$ is integrally dependent on $\bar\xi_1,\dots,\bar\xi_r$. Hence our
definition of a simple point P can now be completed by the following remark:
the r elements $\eta_1,\dots,\eta_r$ enjoying the property that $\mathfrak{p}_0$ [$= (\xi_1,\dots,\xi_n)$]
⁵ See [2], p. 41.

⁶ The algebraic independence is already implied by the non-vanishing of the
determinant $|c_{ij}|$, as a consequence of the fact that $\mathfrak{p}_0$ is then an isolated component
of the ideal $(\bar\xi_1,\dots,\bar\xi_r)$. Namely, if the $\bar\xi$'s are algebraically dependent, it is
permissible to assume that $\bar\xi_r$ is integrally dependent on $\bar\xi_1,\dots,\bar\xi_{r-1}$. In view of the
algebraic closure of the ground field, this implies that if $\mathfrak{p}$ is any minimal prime ideal
of the ideal $(\bar\xi_1,\dots,\bar\xi_{r-1})$ which is a multiple of $\mathfrak{p}_0$, then $\bar\xi_r$, and hence also the ideal
$(\bar\xi_1,\dots,\bar\xi_r)$, is divisible by $\mathfrak{p}$.




(the point P being at the origin) is an isolated component of the ideal
$(\eta_1,\dots,\eta_r)$, can be chosen in such a manner that, in addition, every element
in $\mathfrak{o}$ be integrally dependent on $\eta_1,\dots,\eta_r$; in particular, r suitable linear
forms in $\xi_1,\dots,\xi_n$ will meet both these requirements. We make the necessary
transformation of the coördinates $\xi_i$ and we assume from now on that
these linear forms are $\xi_1,\dots,\xi_r$ respectively.


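3. Let again $\eta_1,\dots,\eta_r$ be r elements in $\mathfrak{o}$ with the property that $\mathfrak{p}_0$
is an isolated component of the ideal $(\eta_1,\dots,\eta_r)$. As has been pointed out
above (footnote 6), the $\eta$'s are algebraically independent. We introduce the
ring $K\{\eta\}$ ($= K\{\eta_1,\dots,\eta_r\}$) of all formal power series in $\eta_1,\dots,\eta_r$ with
coefficients in K. Any element in $K\{\eta\}$ can be written in the form

$\psi_0 + \psi_1 + \cdots + \psi_m + \cdots = \phi_m + R_{m+1}$,

where $\psi_i$ is a form of degree i in $\eta_1,\dots,\eta_r$ and where $\phi_m = \sum_{i=0}^{m} \psi_i$.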


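THEOREM 2. There exists an isomorphic mapping of the ring $\mathfrak{o}$ upon
a subring of $K\{\eta\}$,⁷ with the following property: if $\omega$ is any element in $\mathfrak{o}$ and
if $\psi_0 + \psi_1 + \cdots$ is the corresponding element in $K\{\eta\}$, then, for all m,

(6) $\omega \equiv \phi_m\ (\mathfrak{p}_0^{m+1})$.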

Proof. We shall first prove the theorem in the case when $\eta_1,\dots,\eta_r$
coincide with $\xi_1,\dots,\xi_r$ respectively. Its validity in the general case will
then be an immediate consequence.

We first show that given any element $\omega$ in $\mathfrak{o}$ there exists a polynomial
$\phi_m(\xi_1,\dots,\xi_r)$ of degree $\le m$, such that the congruence (6) holds true.
The assertion is trivial for m = 0, and has been proved for m = 1 (see congruence
(5)). We assume that the assertion is true for m = i and we prove
it for m = i + 1. Let then $\omega \equiv \phi_i\ (\mathfrak{p}_0^{i+1})$, where $\phi_i$ is a
polynomial of degree $\le i$. We can write:

$\omega = \phi_i + f_{i+1}(\xi_1,\dots,\xi_n) + \cdots + f_{i+s}(\xi_1,\dots,\xi_n)$,

where $f_j$ is a form of degree j. Let

$\xi_{r+t} \equiv \sum_{j=1}^{r} c_{tj}\xi_j\ (\mathfrak{p}_0^2)$, $(t = 1,2,\dots,n-r)$.


⁷ In a more precise language: there exists an isomorphic mapping of $\mathfrak{o}$ upon a
subring of $K\{u_1,\dots,u_r\}$ (the u's being parameters) in which $\eta_i \to u_i$, etc. What we
did is to identify the $\eta$'s with the u's.




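Then it is clear that

$f_{i+1}(\xi_1,\dots,\xi_n) \equiv f_{i+1}\bigl(\xi_1,\dots,\xi_r,\ \sum_{j=1}^{r} c_{1j}\xi_j,\ \dots,\ \sum_{j=1}^{r} c_{n-r,j}\xi_j\bigr)\ (\mathfrak{p}_0^{i+2})$,

and that moreover $f_j(\xi_1,\dots,\xi_n) \equiv 0\ (\mathfrak{p}_0^{i+2})$, if $j > i+1$. Hence, if we put

(7) $\phi_{i+1} = \phi_i + f_{i+1}\bigl(\xi_1,\dots,\xi_r,\ \sum_{j=1}^{r} c_{1j}\xi_j,\ \dots,\ \sum_{j=1}^{r} c_{n-r,j}\xi_j\bigr)$,

then

$\omega \equiv \phi_{i+1}\ (\mathfrak{p}_0^{i+2})$,

which proves our assertion.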

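We next show that, given the element $\omega$, the polynomial $\phi_m$ in (6) is
uniquely determined. We shall prove this by induction with respect to m,
since for m = 0 the assertion is trivial. Let us assume that there exist two
polynomials, $\phi_m(\xi_1,\dots,\xi_r)$ and $\phi'_m(\xi_1,\dots,\xi_r)$, both of degree $\le m$, such
that $\omega \equiv \phi_m\ (\mathfrak{p}_0^{m+1})$, $\omega \equiv \phi'_m\ (\mathfrak{p}_0^{m+1})$. Let $\phi_m = \phi_{m-1} + \psi_m$, $\phi'_m = \phi'_{m-1} + \psi'_m$,
where $\phi_{m-1}$ and $\phi'_{m-1}$ are of degree $\le m-1$ and $\psi_m$ and $\psi'_m$ are forms of
degree m. Since $\psi_m$ and $\psi'_m$ are in $\mathfrak{p}_0^m$, it follows that $\omega \equiv \phi_{m-1}\ (\mathfrak{p}_0^m)$ and
$\omega \equiv \phi'_{m-1}\ (\mathfrak{p}_0^m)$. By our induction we conclude that $\phi_{m-1} = \phi'_{m-1}$. Hence
$\phi_m - \phi'_m = \psi_m - \psi'_m = g_m$, and $g_m \equiv 0\ (\mathfrak{p}_0^{m+1})$, where $g_m$ is a form in
$\xi_1,\dots,\xi_r$, not identically zero, of degree m.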

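Let us denote the ideal $(\xi_1,\dots,\xi_r)$ by $\mathfrak{A}$. By hypothesis, $\mathfrak{p}_0$ is an isolated
component of $\mathfrak{A}$. Moreover, since every element in $\mathfrak{o}$ is integrally
dependent on $\xi_1,\dots,\xi_r$, $\mathfrak{A}$ is unmixed, all its components being necessarily
zero-dimensional. Let $\mathfrak{A} = [\mathfrak{p}_0, \mathfrak{q}_1,\dots,\mathfrak{q}_s]$. Since all the components are
zero-dimensional, no two of them have common divisors. Hence their intersection
coincides with their product, and we can write $\mathfrak{A} = \mathfrak{p}_0\mathfrak{q}_1\cdots\mathfrak{q}_s$. Now
$g_m \equiv 0\ (\mathfrak{A}^m)$, i.e. $g_m \subset [\mathfrak{p}_0^m, \mathfrak{q}_1^m,\dots,\mathfrak{q}_s^m]$. On the other
hand $g_m \equiv 0\ (\mathfrak{p}_0^{m+1})$, whence $g_m \subset [\mathfrak{p}_0^{m+1}, \mathfrak{q}_1^m,\dots,\mathfrak{q}_s^m] = \mathfrak{p}_0^{m+1}\mathfrak{q}_1^m\cdots\mathfrak{q}_s^m$.
Hence $g_m \equiv 0\ (\mathfrak{p}_0\cdot\mathfrak{A}^m)$.

This last congruence shows that $g_m$ can be expressed as a form $f(\xi_1,\dots,\xi_r)$
in $\xi_1,\dots,\xi_r$ of degree m, with coefficients in $\mathfrak{p}_0$. Consider the norm N(f)
of f over $K(\xi_1,\dots,\xi_r)$. Clearly $N(f) = g_m^{\nu}$, where $\nu = [\Sigma : K(\xi_1,\dots,\xi_r)]$,
the relative degree of $\Sigma$ over $K(\xi_1,\dots,\xi_r)$. On the other hand N(f) is a form
of degree $\nu m$ in $\xi_1,\dots,\xi_r$, and its coefficients are polynomials in $\xi_1,\dots,\xi_r$
(since the coefficients of f are integrally dependent on $\xi_1,\dots,\xi_r$) which
vanish for $\xi_1 = \cdots = \xi_r = 0$ (since the coefficients of f are in $\mathfrak{p}_0$).⁸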


⁸ All this can be seen as follows. Let $u_1, u_2,\dots,u_M$ be the various power products
$\xi_1^{i_1}\xi_2^{i_2}\cdots\xi_r^{i_r}$, $i_1 + i_2 + \cdots + i_r = m$. Then f is of the form $f = u_1a_1 + \cdots + u_Ma_M$,




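We have therefore the following result: the form $g_m^{\nu}$ in $\xi_1,\dots,\xi_r$, of
degree $m\nu$, is equal to a polynomial in $\xi_1,\dots,\xi_r$, having no terms of degree
$< m\nu + 1$. This is in contradiction with the algebraic independence of
$\xi_1,\dots,\xi_r$. The uniqueness of the polynomial $\phi_m$ in (6) is thus established.

The formula (7) shows that if $\omega \equiv \phi_m\ (\mathfrak{p}_0^{m+1})$ and $\omega \equiv \phi_{m+1}\ (\mathfrak{p}_0^{m+2})$, then
$\phi_{m+1} - \phi_m$ is a form $\psi_{m+1}$ in $\xi_1,\dots,\xi_r$ of degree m + 1. We let correspond
to $\omega$ the power series

(8) $\omega \to \psi_0 + \psi_1 + \cdots + \psi_m + \cdots$.

It is now a simple matter to show that (8) defines an isomorphic mapping of
$\mathfrak{o}$ upon a subring H of $K\{\xi_1,\dots,\xi_r\}$.

(a) First, let $\omega$ and $\omega'$ be distinct elements of $\mathfrak{o}$, and let

$\omega \to \psi = \psi_0 + \psi_1 + \cdots$, $\omega' \to \psi' = \psi'_0 + \psi'_1 + \cdots$.

Assume that $\psi_i = \psi'_i$, for all i. Since

$\omega - \sum_{i=0}^{m} \psi_i \equiv 0\ (\mathfrak{p}_0^{m+1})$ and $\omega' - \sum_{i=0}^{m} \psi'_i \equiv 0\ (\mathfrak{p}_0^{m+1})$,

it would follow that $\omega - \omega' \equiv 0\ (\mathfrak{p}_0^{m+1})$, for all m. This is impossible if
$\omega \neq \omega'$, since zero is the only element which is common to all the powers of $\mathfrak{p}_0$.
Hence, to two distinct elements of $\mathfrak{o}$ there correspond two distinct power series.

(b) We have $\omega + \omega' - \sum_{i=0}^{m} (\psi_i + \psi'_i) \equiv 0\ (\mathfrak{p}_0^{m+1})$, hence
to $\omega + \omega'$ there corresponds the power series $\psi + \psi'$.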


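$a_j \equiv 0\ (\mathfrak{p}_0)$, $j = 1, 2,\dots,M$. Let $f^{(1)}, f^{(2)},\dots,f^{(\nu-1)}$ be the conjugates of f
over $K(\xi_1,\dots,\xi_r)$, $f^{(i)} = u_1a_1^{(i)} + \cdots + u_Ma_M^{(i)}$, where $a_j^{(1)},\dots,a_j^{(\nu-1)}$
are the conjugates of $a_j$. The conjugates of any $a_j$ are integrally dependent on $\xi_1,\dots,\xi_r$.
Hence N(f) ($= f f^{(1)}\cdots f^{(\nu-1)}$) is a form of degree $\nu m$ in $\xi_1,\dots,\xi_r$ with coefficients
which are polynomials in $\xi_1,\dots,\xi_r$. We consider the least Galois extension $\Sigma^*$ of
$K(\xi_1,\dots,\xi_r)$ which contains the field $\Sigma$. We also consider the smallest ring $\mathfrak{o}^*$
containing $\mathfrak{o}$ and its conjugate rings, i.e. the ring $\mathfrak{o}^* = (\mathfrak{o}, \mathfrak{o}^{(1)},\dots,\mathfrak{o}^{(\nu-1)})$.
Let $\mathfrak{p}_0^*$ be any prime ideal divisor of $\mathfrak{o}^*\cdot\mathfrak{p}_0$ in $\mathfrak{o}^*$. The
coefficients of N(f), considered as a form of degree $\nu m$ in $\xi_1,\dots,\xi_r$, clearly belong to
the ideal $\mathfrak{o}^*\cdot(a_1,\dots,a_M)$. Since $a_j \equiv 0\ (\mathfrak{p}_0) \equiv 0\ (\mathfrak{p}_0^*)$, all these coefficients belong to
$\mathfrak{p}_0^*$. On the other hand, these coefficients are in $K[\xi_1,\dots,\xi_r]$, and it is clear that the
intersection $\mathfrak{p}_0^* \cap K[\xi_1,\dots,\xi_r]$ is the prime 0-dimensional ideal $(\xi_1,\dots,\xi_r)$. Hence
the coefficients of N(f) are polynomials in $\xi_1,\dots,\xi_r$, without constant term.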




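(c) Let $\psi\psi' = g = g_0 + g_1 + \cdots$. Clearly, $\sum_{i=0}^{m} g_i$ differs from
$\sum_{i=0}^{m} \psi_i \cdot \sum_{i=0}^{m} \psi'_i$ by terms of degree > m, whence

(9) $\sum_{i=0}^{m} \psi_i \cdot \sum_{i=0}^{m} \psi'_i - \sum_{i=0}^{m} g_i \equiv 0\ (\mathfrak{p}_0^{m+1})$.

On the other hand, we have

$\omega \equiv \sum_{i=0}^{m} \psi_i\ (\mathfrak{p}_0^{m+1})$, $\omega' \equiv \sum_{i=0}^{m} \psi'_i\ (\mathfrak{p}_0^{m+1})$,

whence

(9') $\omega\omega' - \sum_{i=0}^{m} \psi_i \cdot \sum_{i=0}^{m} \psi'_i \equiv 0\ (\mathfrak{p}_0^{m+1})$.

Adding (9) and (9'), we obtain:

$\omega\omega' - \sum_{i=0}^{m} g_i \equiv 0\ (\mathfrak{p}_0^{m+1})$,

and this shows that to $\omega\omega'$ there corresponds the power series $\psi\psi'$. This completes
the proof of Theorem 2 for the case when $\eta_1,\dots,\eta_r$ coincide with
$\xi_1,\dots,\xi_r$.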
The general case can now be settled in a few words. Let us consider the
power series which correspond to $\eta_1,\dots,\eta_r$ in our mapping of $\mathfrak{o}$ upon the
subring H of $K\{\xi_1,\dots,\xi_r\}$. The constant term in each of these power series
is zero, since $\eta_i \equiv 0\ (\mathfrak{p}_0)$. Let

(10) $\eta_i \to \eta_i^* = c_{i1}\xi_1 + \cdots + c_{ir}\xi_r$ + terms of higher degree, $\eta_i^*$ in H,
$(i = 1,2,\dots,r)$.

Since $\mathfrak{p}_0$ is an isolated component of the ideal $(\eta_1,\dots,\eta_r)$, it follows by
Theorem 1, that the determinant $|c_{ij}|$ is different from zero. It is therefore
possible to "solve" the relations (10) for $\xi_1,\dots,\xi_r$ and to express each $\xi_i$ as
a power series in $\eta_1^*,\dots,\eta_r^*$. In other terms: the ring $K\{\xi_1,\dots,\xi_r\}$ can
be mapped isomorphically upon the ring $K\{\eta_1,\dots,\eta_r\}$. In this mapping the
subring H of $K\{\xi_1,\dots,\xi_r\}$ is mapped upon a subring H' of $K\{\eta_1,\dots,\eta_r\}$
and $\mathfrak{o} \cong H'$. Let $\omega$ be any element in $\mathfrak{o}$ and let

$\omega \to \sum_{i=0}^{\infty} g_i = g$, g in H',

where $g_i$ is a form in $\eta_1,\dots,\eta_r$ of degree i. The two power series in
$K\{\xi_1,\dots,\xi_r\}$ which correspond to $\omega$ and to $\sum_{i=0}^{m} g_i$ respectively, differ by terms
of degree > m. Hence $\omega - \sum_{i=0}^{m} g_i \equiv 0\ (\mathfrak{p}_0^{m+1})$, q. e. d.


i=0 
In the sequel we shall feel free to identify any element $\omega$ with its corresponding
power series $\sum_{i=0}^{\infty} g_i$ in $\eta_1,\dots,\eta_r$, and we shall write $\omega = \sum_{i=0}^{\infty} g_i$. We
shall say that $\sum_{i=0}^{\infty} g_i$ is the expansion of $\omega$ in a power series of $\eta_1,\dots,\eta_r$. We
shall also refer to the isomorphic mapping of $\mathfrak{o}$ upon the subring H' of
$K\{\eta_1,\dots,\eta_r\}$ as a uniformization at P. The elements $\eta_1,\dots,\eta_r$ shall be
called uniformizing parameters.
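
To make the notion of an expansion concrete, the following sketch works out the first terms for a plane curve. The curve (a circle through the origin) is our own assumed example and is not taken from the text; it is chosen so that $\xi_1$ can serve as the single uniformizing parameter and so that $\xi_2$ is integrally dependent on $\xi_1$.

```latex
% Assumed example (not from the source): r = 1, n = 2, and V_1 is the
% circle  \xi_1^2 + \xi_2^2 - 2\xi_2 = 0,  with P the origin.
\[
  \mathfrak{p}_0 = (\xi_1,\xi_2), \qquad
  \xi_2 = \tfrac12\bigl(\xi_1^{2} + \xi_2^{2}\bigr) \equiv 0 \ (\mathfrak{p}_0^{2}),
  \qquad \mathfrak{p}_0 = (\xi_1,\ \mathfrak{p}_0^{2}).
\]
% Congruence (6) for \omega = \xi_2, obtained by substituting the
% relation into itself (\xi_2 lies in p_0^2, so \xi_2^2 lies in p_0^4):
\[
  \xi_2 - \tfrac12\xi_1^{2} = \tfrac12\xi_2^{2} \equiv 0 \ (\mathfrak{p}_0^{4}),
  \qquad
  \xi_2 - \tfrac12\xi_1^{2} - \tfrac18\xi_1^{4} \equiv 0 \ (\mathfrak{p}_0^{6}).
\]
% The expansion of \xi_2 in a power series of \xi_1 therefore begins
\[
  \xi_2 = \tfrac12\xi_1^{2} + \tfrac18\xi_1^{4} + \tfrac1{16}\xi_1^{6} + \cdots
        \;=\; 1 - \sqrt{1 - \xi_1^{2}} .
\]
```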


4. Geometrically speaking, Theorem 2, in the special case $\eta_i = \xi_i$, says
that the variety $V_r$ possesses at the point $P(0,\dots,0)$ a linear analytical
branch, whose tangent $S_r$ at P has only the point P in common with the $S_{n-r}$
given by the equations $\xi_1 = \cdots = \xi_r = 0$. We have not yet shown that there
are no other branches at P, and that consequently P is indeed a simple point
in the ordinary geometric sense.⁹


⁹ Actually the existence of a linear analytical branch at P is implied already by
a part of Theorem 2. Namely, if we only knew that there exists an isomorphic mapping
of $\mathfrak{o}$ upon a subring of $K\{\xi_1,\dots,\xi_r\}$ such that the congruence (6) holds true for
m = 0, we could already associate with this mapping a linear branch at $P(0,\dots,0)$.
It will follow from Theorem 4 that it is the validity of the congruence (6) for
all m that implies that there are no other branches through P. Thus the existence
of an isomorphic mapping of $\mathfrak{o}$ upon a subring of $K\{\xi_1,\dots,\xi_r\}$ together with the
validity of the congruence (6) for all m characterizes a simple point of $V_r$. In this
connection we wish to call attention to the following question: assuming $\mathfrak{o}$ to be
integrally closed (i.e. assuming that $V_r$ is normal in the affine space (section 14)), is
it true that the neighborhood of every point of $V_r$ is an analytically irreducible variety?
For algebraic functions of one variable the answer is well-known to be affirmative. It
seems to us that for functions of several variables the question presents serious
difficulties. An equivalent ideal theoretic formulation of the question is the following:
Let $\omega$ be a primitive element of $\Sigma$ over $K(\xi_1,\dots,\xi_r)$, $\mathfrak{o}$ being assumed to be the integral
closure of $K[\xi_1,\dots,\xi_r]$ in $\Sigma$. Let $F(\xi_1,\dots,\xi_r,\omega) = 0$ be the defining (irreducible)
equation for $\omega$. Over the field of meromorphic functions of $\xi_1,\dots,\xi_r$ the polynomial
F factors: $F = F_1F_2\cdots F_h$, where $F_i$ is a polynomial in $\omega$ with coefficients in $K\{\xi_1,\dots,\xi_r\}$
and leading coefficient 1. It is not difficult to associate with each factor $F_i$ a definite
prime 0-dimensional ideal in $\mathfrak{o}$ which is a divisor of $\mathfrak{o}\cdot(\xi_1,\dots,\xi_r)$ (see next section).
The question is: do there correspond to distinct factors distinct ideals? The methods
used in this connection for algebraic functions of one variable and for fields of algebraic
numbers (see [4]) break down in the case of algebraic functions of more than one
variable. If the answer to this question was affirmative, then the following could be
proved: if $\mathfrak{o}$ is integrally closed and if there exists an isomorphic mapping of $\mathfrak{o}$ upon a
subring of $K\{\xi_1,\dots,\xi_r\}$ such that (6) holds true for m = 0, then (6) holds true for




We will show, however, that there exists a non-singular projection $V'_r$ of
$V_r$ into an $S_{r+1}(y_1,\dots,y_{r+1})$, such that if $f(y_1,\dots,y_{r+1}) = 0$ is the equation
of $V'_r$, then the point P is projected into a point P' of $V'_r$ at which not
all the derivatives $f'_{y_i}$ vanish and which is therefore a simple point of $V'_r$,
in the ordinary sense of this term.

In addition to this we shall also wish to show that the existence of a
uniformization at P with the property described in Theorem 2 (i. e. congruence
(6), for all m) implies that P is a simple point of $V_r$ (in the sense of our
definition). Accordingly, we consider a point P of $V_r$, which we shall assume
to be the origin (whence the corresponding prime 0-dimensional ideal in $\mathfrak{o}$ is
$\mathfrak{p}_0 = (\xi_1,\dots,\xi_n)$) and we make the following assumptions: there exist in $\mathfrak{o}$
r uniformizing parameters $\eta_1,\dots,\eta_r$ for the point P, i.e. there exists an
isomorphic mapping of $\mathfrak{o}$ upon a subring of $K\{\eta_1,\dots,\eta_r\}$ in which every
element of $\mathfrak{p}_0$ is mapped upon a power series in $\eta_1,\dots,\eta_r$ which has no constant
term (congruence (6), for m = 0!). We also assume that this uniformization
has the property that if to an element $\omega$ in $\mathfrak{o}$ there corresponds the
power series $\psi_0 + \psi_1 + \cdots$, then $\omega \equiv \psi_0 + \psi_1 + \cdots + \psi_m\ (\mathfrak{p}_0^{m+1})$, for all m
(congruence (6), for all m). When these assumptions are satisfied, we shall
say briefly that there exists a uniformization of the whole neighborhood of P.


THEOREM 3. If $\eta_1,\dots,\eta_r$ are uniformizing parameters for a uniformization
of the whole neighborhood of the point $P(0,\dots,0)$, then $\mathfrak{p}_0$ is an
isolated component of the ideal $\mathfrak{o}\cdot(\eta_1,\dots,\eta_r)$ and hence P is a simple
point of $V_r$.


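Proof. By hypothesis, any element $\omega$ in $\mathfrak{o}$ satisfies a congruence of the
form $\omega \equiv c_0 + c_1\eta_1 + \cdots + c_r\eta_r\ (\mathfrak{p}_0^2)$, and $c_0 = 0$ if $\omega \equiv 0\ (\mathfrak{p}_0)$. It follows
immediately that $\mathfrak{p}_0 = (\eta_1,\dots,\eta_r,\ \mathfrak{p}_0^2)$, and this implies (see proof of
Theorem 1) that $\mathfrak{p}_0$ is an isolated component of the ideal $\mathfrak{o}\cdot(\eta_1,\dots,\eta_r)$, q. e. d.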


We observe that we have used in the proof only the congruence
$\omega \equiv \psi_0 + \psi_1\ (\mathfrak{p}_0^2)$. We can therefore state the preceding theorem in the
following stronger form:


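THEOREM 3.1. If there exist r elements $\eta_1,\dots,\eta_r$ such that every
element $\omega$ in $\mathfrak{o}$ satisfies a congruence of the form $\omega \equiv c_0 + c_1\eta_1 + \cdots
+ c_r\eta_r\ (\mathfrak{p}_0^2)$ (congruence (6), for m = 0, 1 only!), then $\mathfrak{p}_0$ is an isolated
component of the ideal $\mathfrak{o}\cdot(\eta_1,\dots,\eta_r)$ and P is a simple point of $V_r$.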


all m, and hence P is a simple point. Geometrically speaking, this statement signifies 
just this: the analytical irreducibility of the neighborhood of P together with the 
existence of a linear branch through P characterize P as a simple point. 




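The assumption in Theorem 3.1 implies that the elements $\eta_1,\dots,\eta_r$
form a modular basis of the ring $\mathfrak{p}_0/\mathfrak{p}_0^2$ considered as a K-module. Moreover,
since $\mathfrak{p}_0$ is, according to this theorem, an isolated component of $(\eta_1,\dots,\eta_r)$,
we know, from section 2, that $\eta_1,\dots,\eta_r$ are linearly independent modulo $\mathfrak{p}_0^2$.
Hence $\mathfrak{p}_0/\mathfrak{p}_0^2$ is a K-module of rank r. Conversely, if $\mathfrak{p}_0/\mathfrak{p}_0^2$ is of rank $\le r$,
then there exist r elements in $\mathfrak{o}$, say $\eta_1,\dots,\eta_r$, such that every other element
$\omega$ in $\mathfrak{p}_0$ satisfies a congruence of the form $\omega \equiv c_1\eta_1 + \cdots + c_r\eta_r\ (\mathfrak{p}_0^2)$. Hence
$\mathfrak{p}_0 = (\eta_1,\dots,\eta_r,\ \mathfrak{p}_0^2)$, and from this it follows, as in section 2, that $\mathfrak{p}_0$ is an
isolated component of the ideal $(\eta_1,\dots,\eta_r)$ and that $\eta_1,\dots,\eta_r$ are
linearly independent mod $\mathfrak{p}_0^2$. Hence P is a simple point and necessarily
$\mathfrak{p}_0/\mathfrak{p}_0^2$ is of rank exactly r. We therefore have the following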


THEOREM 3.2. If P is a point on $V_r$ and $\mathfrak{p}_0$ is the corresponding prime
0-dimensional ideal in $\mathfrak{o}$, a necessary and sufficient condition that P be a simple
point is that the K-module $\mathfrak{p}_0/\mathfrak{p}_0^2$ be of rank r.
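
The rank criterion of Theorem 3.2 is easy to test by hand. The following sketch uses a cuspidal cubic, which is our own assumed example and not one discussed in the text, and compares the origin with another point of the same curve.

```latex
% Assumed example (not from the source): r = 1, n = 2, and V_1 is the
% cuspidal cubic  \xi_2^2 = \xi_1^3.

% (i) At the origin, \mathfrak{p}_0 = (\xi_1,\xi_2). The defining relation
% x_2^2 - x_1^3 already lies in (x_1,x_2)^2, so it imposes no linear
% relation between \xi_1 and \xi_2 modulo \mathfrak{p}_0^2:
\[
  \mathfrak{p}_0/\mathfrak{p}_0^{2} = K\xi_1 \oplus K\xi_2,
  \qquad \text{rank } 2 \neq r = 1 ,
\]
% and the origin is not a simple point.

% (ii) At P(1,1), put u = \xi_1 - 1, v = \xi_2 - 1, \mathfrak{p}_0 = (u,v). Then
\[
  0 = (1+v)^{2} - (1+u)^{3} = 2v - 3u + v^{2} - 3u^{2} - u^{3},
  \qquad\text{so}\qquad 2v \equiv 3u \ (\mathfrak{p}_0^{2}),
\]
% hence \mathfrak{p}_0 = (u, \mathfrak{p}_0^2), the rank of
% \mathfrak{p}_0/\mathfrak{p}_0^2 is 1 = r, and P(1,1) is a simple point.
```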


5. Let $P(0,\dots,0)$ be a simple point of $V_r$. We assume, as in the
preceding section, that a linear homogeneous transformation of the coördinates
$\xi_i$ has already been performed, so as to make $\xi_1,\dots,\xi_r$ uniformizing
parameters for a uniformization of the whole neighborhood of the point P, and,
moreover, that every element in $\mathfrak{o}$ is integrally dependent on $K[\xi_1,\dots,\xi_r]$.

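The field $\Sigma = K(\xi_1,\dots,\xi_n)$ is an algebraic extension of $K(\xi_1,\dots,\xi_r)$.
Let m be the relative degree of $\Sigma$ over $K(\xi_1,\dots,\xi_r)$. If $\omega$ is an element in $\mathfrak{o}$,
then

$N(z-\omega) = z^m + A_1z^{m-1} + \cdots + A_m = F(z;\xi_1,\dots,\xi_r)$,

where $A_1,\dots,A_m$ are polynomials in $\xi_1,\dots,\xi_r$. Reducing the equation
$F(\omega;\xi_1,\dots,\xi_r) = 0$ modulo $\mathfrak{p}_0$, we find $F(c;0,\dots,0) = 0$, where
$\omega \equiv c\ (\mathfrak{p}_0)$. Hence c is a root of $F(z;0,\dots,0)$.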

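Since every element in $\mathfrak{o}$ is integrally dependent on $K[\xi_1,\dots,\xi_r]$, the
ideal $\mathfrak{o}\cdot(\xi_1,\dots,\xi_r)$ is unmixed and zero-dimensional, and $\mathfrak{p}_0$ is one of its
components. Let $[\mathfrak{p}_0, \mathfrak{q}_1,\dots,\mathfrak{q}_s]$ be the decomposition of the ideal
$\mathfrak{o}\cdot(\xi_1,\dots,\xi_r)$ into primary components and let $\mathfrak{p}_0, \mathfrak{p}_1,\dots,\mathfrak{p}_s$ be the
associated prime zero-dimensional ideals.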


THEOREM 4. Under the assumption that $\xi_1,\dots,\xi_r$ are uniformizing
parameters for a uniformization of the whole neighborhood of the simple
point $P(0,\dots,0)$ on $V_r$ and that every element in $\mathfrak{o}$ is integrally dependent
on $\xi_1,\dots,\xi_r$, there exist elements $\omega$ in $\mathfrak{o}$ such that $F'_\omega(\omega;\xi_1,\dots,\xi_r)
\not\equiv 0\ (\mathfrak{p}_0)$, where $F(z;\xi_1,\dots,\xi_r)$ is the norm of $z-\omega$ over $K(\xi_1,\dots,\xi_r)$.
Such elements $\omega$ are characterized by the following condition: if $\omega$ has the
value c at P (i.e. if $\omega \equiv c\ (\mathfrak{p}_0)$) and if $\mathfrak{p}_1,\dots,\mathfrak{p}_s$ are the prime 0-dimensional
ideals, other than $\mathfrak{p}_0$, which divide the ideal $\mathfrak{o}\cdot(\xi_1,\dots,\xi_r)$, then
$\omega \not\equiv c\ (\mathfrak{p}_i)$, $i = 1,2,\dots,s$.
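
The condition of Theorem 4 can be checked directly in a small case. The sketch below again assumes, as our own illustrative example and not one from the text, the circle $\xi_1^2 + \xi_2^2 - 2\xi_2 = 0$ with P at the origin, where $\xi_1$ is a uniformizing parameter and the relative degree is 2. It contrasts an element that separates P from the other point over $\xi_1 = 0$ with one that does not.

```latex
% Assumed example (not from the source): V_1 is the circle
% \xi_1^2 + \xi_2^2 - 2\xi_2 = 0, P the origin, r = 1, degree m = 2.
\[
  \mathfrak{o}\cdot(\xi_1) = [\mathfrak{p}_0,\ \mathfrak{p}_1], \qquad
  \mathfrak{p}_0 = (\xi_1,\xi_2), \quad \mathfrak{p}_1 = (\xi_1,\ \xi_2 - 2).
\]
% Take \omega = \xi_2; its conjugate over K(\xi_1) is 2 - \xi_2, so
\[
  F(z;\xi_1) = N(z - \xi_2) = (z - \xi_2)(z - 2 + \xi_2) = z^{2} - 2z + \xi_1^{2},
  \qquad F(z;0) = z(z-2).
\]
% The roots 0 and 2 are the values of \omega at \mathfrak{p}_0 and \mathfrak{p}_1:
\[
  \omega \equiv 0 \ (\mathfrak{p}_0), \qquad \omega \equiv 2 \ (\mathfrak{p}_1), \qquad
  F'_\omega(\omega;\xi_1) = 2\xi_2 - 2 \equiv -2 \not\equiv 0 \ (\mathfrak{p}_0).
\]
% By contrast, \omega' = \xi_2(\xi_2 - 2) = -\xi_1^2 has the value 0 at
% both \mathfrak{p}_0 and \mathfrak{p}_1, and
\[
  N(z - \omega') = (z + \xi_1^{2})^{2}, \qquad
  F'_{\omega'}(\omega';\xi_1) = 2(\omega' + \xi_1^{2}) = 0 .
\]
```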


We give here two proofs of this theorem. The first is based upon the
factorization of $F(z;\xi_1,\dots,\xi_r)$ in the ring $K\{\xi_1,\dots,\xi_r\}[z]$ and upon
higher congruences to which this factorization leads.¹⁰

The second proof makes use of the least Galois extension field which contains
our field $\Sigma$ and its conjugates over $K(\xi_1,\dots,\xi_r)$.


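First proof. Elements $\omega$ in $\mathfrak{o}$ satisfying the conditions $\omega \equiv c\ (\mathfrak{p}_0)$,
$\omega \not\equiv c\ (\mathfrak{p}_i)$, $i = 1, 2, \dots, s$, certainly exist. Namely, since no two of the
ideals $\mathfrak{p}_0, \mathfrak{p}_1, \dots, \mathfrak{p}_s$ have common divisors, it is well known that given any
$s + 1$ elements $\alpha_0, \alpha_1, \dots, \alpha_s$ in $\mathfrak{o}$, there exist elements $\omega$ in $\mathfrak{o}$ satisfying the
congruences $\omega \equiv \alpha_i\ (\mathfrak{p}_i)$, $i = 0, 1, \dots, s$. To satisfy the above conditions, we
have only to take for $\alpha_0, \dots, \alpha_s$ any $s + 1$ constants in $K$, say $c_0, c_1, \dots, c_s$,
such that $c_0 \neq c_i$, $i = 1, 2, \dots, s$.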

It is not difficult to prove that there also exist primitive elements $\omega$ over
$K(\xi_1, \dots, \xi_r)$ satisfying the above conditions (as a matter of fact, we shall
prove later on that these conditions imply that $\omega$ is a primitive element). To
show this we consider an element $\zeta$ in $\mathfrak{o}$ satisfying the congruences $\zeta \equiv d_i\ (\mathfrak{p}_i)$,
where $d_0, d_1, \dots, d_s$ are distinct constants. Let $\zeta'$ be some primitive element
in $\mathfrak{o}$ and let $\zeta' \equiv d'_i\ (\mathfrak{p}_i)$. If $t$ is a parameter, then the discriminant of
$N(z - \zeta' - t\zeta)$ is a polynomial $D(t; \xi_1, \dots, \xi_r)$ which does not vanish
identically in $t$, since $D(0; \xi_1, \dots, \xi_r)$ is the discriminant of $N(z - \zeta')$ and
does not vanish. Let $A_{ij}(t) = (d'_i - d'_j) + t(d_i - d_j)$, $i, j = 0, 1, 2, \dots, s$, $i \neq j$.
Since $d_i \neq d_j$ if $i \neq j$, none of these polynomials vanishes identically.
Let $t_0$ be a value of $t$ (in $K$) such that $D(t_0; \xi_1, \dots, \xi_r) \neq 0$, $A_{ij}(t_0) \neq 0$.
Then the element $\omega = \zeta' + t_0\zeta$ is obviously a primitive element. Moreover,
we have $\omega \equiv d'_i + t_0 d_i\ (\mathfrak{p}_i)$ and $d'_i + t_0 d_i \neq d'_j + t_0 d_j$, if $i \neq j$, since
$A_{ij}(t_0) \neq 0$.
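The choice of a non-special value $t_0$ can be made concrete. The following sketch, with the hypothetical helper `pick_t0`, covers only the conditions $A_{ij}(t_0) \neq 0$, i.e. that the values $d'_i + t_0 d_i$ are pairwise distinct; the condition $D(t_0; \xi) \neq 0$ on the discriminant is deliberately left out, since it depends on the field $\Sigma$ itself. The constants are assumed to be rational numbers for the sake of the illustration.

```python
from fractions import Fraction
from itertools import count

def pick_t0(d_prime, d):
    """Smallest integer t0 >= 0 with A_ij(t0) != 0 for all i != j,
    where A_ij(t) = (d'_i - d'_j) + t (d_i - d_j).
    Returns t0 together with the values d'_i + t0 * d_i."""
    if len(set(d)) != len(d):
        raise ValueError("the constants d_i must be distinct")
    n = len(d)
    for t0 in count(0):
        vals = [Fraction(d_prime[i]) + t0 * Fraction(d[i]) for i in range(n)]
        # A_ij(t0) != 0 for all i != j means the n values are pairwise distinct.
        if len(set(vals)) == n:
            return t0, vals

t0, vals = pick_t0([1, 1, 2], [0, 1, 2])
print("t0 =", t0, "values:", vals)
```

Each $A_{ij}$ is a non-zero polynomial of degree one, so only finitely many values of $t$ are excluded and the search terminates.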

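Let then $\omega$ be a primitive element in $\mathfrak{o}$ over $K(\xi_1, \dots, \xi_r)$, $\omega \equiv c_i\ (\mathfrak{p}_i)$,
$c_0 \neq c_i$, $i = 1, 2, \dots, s$. Let

$N(z - \omega) = F(z; \xi_1, \dots, \xi_r) = F(z; \xi).$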


10 This proof follows in part a well-known pattern of the theory of algebraic
functions of one variable and of algebraic number fields. See [4]. The generalization is made
possible by the existence of the uniformization. In the case of algebraic functions of
one variable and of algebraic number fields the proof is based upon the fact that the
ring of polynomials in one variable and the ring of rational integers are principal
ideal rings.




Here $F$ is a polynomial in $z$ and $\xi_1, \dots, \xi_r$, of degree $m$ ($= [\Sigma : K(\xi_1, \dots, \xi_r)]$)
in $z$ and with leading coefficient 1.

Let

(11) $F = F_1(z; \xi)\, F_2(z; \xi) \cdots F_h(z; \xi)$

be the decomposition of $F$ into prime factors in the ring $K\{\xi_1, \dots, \xi_r\}[z]$
of polynomials in $z$ with coefficients in the ring of formal power series in
$\xi_1, \dots, \xi_r$. Since the discriminant of $F$ is different from zero, the prime
factors $F_i$ are distinct. Moreover, each factor $F_i$, of degree $m_i$ in $z$, may be
assumed to have leading coefficient 1. We have $m = m_1 + \cdots + m_h$.

Since $F_i(z; \xi)$ is a prime element in $K\{\xi\}[z]$, it follows by Weierstrass'
preparation theorem$^{11}$ that the polynomial $F_i(z; 0, 0, \dots, 0)$ in $K[z]$ cannot
have two distinct roots. Hence

(12) $F_i(z; 0, \dots, 0) = (z - d_i)^{m_i}, \quad d_i \in K.$

The $h$ constants $d_1, d_2, \dots, d_h$ (not necessarily distinct) are the roots of
$F(z, 0, \dots, 0)$:

$F(z, 0, \dots, 0) = \prod_{j=1}^{h} (z - d_j)^{m_j}.$

Every one of the constants $c_i$, $i = 0, 1, \dots, s$, is equal to some $d_j$. In fact,
reducing the equation $F(\omega; \xi) = 0$ modulo $\mathfrak{p}_i$, we find $F(c_i; 0, \dots, 0) = 0$,
i.e. $c_i$ is a root of $F(z; 0, \dots, 0)$. We shall have occasion to prove
later (footnote 12) that, conversely, each of the constants $d_j$ is equal to
some $c_i$. At any rate, if the constants $c_0, c_1, \dots, c_s$ were distinct, then
necessarily $h \geq s + 1$.

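Let $\omega = \varphi_0 + \varphi_1 + \varphi_2 + \cdots = \varphi$ be the expansion of $\omega$ into a power series
in $\xi_1, \dots, \xi_r$. Here $\varphi_0 = c_0$. Since the mapping $\omega \to \varphi$ of $\mathfrak{o}$ upon the subring
$H$ of $K\{\xi_1, \dots, \xi_r\}$ is an isomorphism, the equation $F(\omega; \xi) = 0$ implies
that $\varphi$ is a root of $F(z; \xi)$. Hence $z - \varphi$ is a factor of $F$. Let, say,

$F_1(z; \xi) = z - \varphi,$

whence $m_1 = 1$ and

(13) $F(z, 0, \dots, 0) = (z - c_0) \prod_{j=2}^{h} (z - d_j)^{m_j}.$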


11 See [5], pp. 261-262.

We wish to show that the constants $d_2, \dots, d_h$ are all different from $c_0$, i.e.
that $c_0$ is a simple root of $F(z, 0, \dots, 0)$. This will imply that

$F'_\omega(\omega; \xi_1, \dots, \xi_r) \not\equiv 0\ (\mathfrak{p}_0),$


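as was asserted in our theorem. For the proof, let us denote by $F_{j\rho}(z; \xi)$
the polynomial in $z$ and in $\xi_1, \dots, \xi_r$ obtained from $F_j(z; \xi)$ by omitting all
terms of degree $> \rho$ in $\xi_1, \dots, \xi_r$. It is clear that

$F(z; \xi) - \prod_{j=1}^{h} F_{j\rho}(z; \xi)$

is a polynomial which does not contain terms of degree $< \rho + 1$ in $\xi_1, \dots, \xi_r$.
It follows that

(14) $\prod_{j=1}^{h} F_{j\rho}(\omega; \xi) \equiv 0\ (\mathfrak{o} \cdot (\xi_1, \dots, \xi_r)^{\rho+1}),$

or also, since $\mathfrak{o} \cdot (\xi_1, \dots, \xi_r)^{\rho+1} = [\mathfrak{p}_0^{\rho+1}, \mathfrak{q}_1^{\rho+1}, \dots, \mathfrak{q}_s^{\rho+1}]$:

(14') $\prod_{j=1}^{h} F_{j\rho}(\omega; \xi) \equiv 0\ ([\mathfrak{p}_0^{\rho+1}, \mathfrak{q}_1^{\rho+1}, \dots, \mathfrak{q}_s^{\rho+1}]).$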


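Now suppose that one of the constants $d_2, \dots, d_h$ coincides with $c_0$ ($= d_1$),
and let, for instance, $d_h = c_0$. We show that this hypothesis leads to a
contradiction. It is clear that $F_{j\rho}(z; 0, \dots, 0) = (z - d_j)^{m_j}$. Hence
$F_{j\rho}(\omega; \xi) \equiv (c_i - d_j)^{m_j}\ (\mathfrak{p}_i)$, $i = 0, 1, \dots, s$. In particular, for $j = h$, we
have

$F_{h\rho}(\omega; \xi) \equiv (c_i - d_h)^{m_h}\ (\mathfrak{p}_i), \quad (i = 0, 1, \dots, s).$

Since $d_h = c_0$ and $c_0 \neq c_i$, $i = 1, \dots, s$, we conclude that

(15) $F_{h\rho}(\omega; \xi) \not\equiv 0\ (\mathfrak{p}_i), \quad (i = 1, 2, \dots, s).$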


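From this we derive, in view of (14'), that

$\prod_{j=1}^{h-1} F_{j\rho}(\omega; \xi) \equiv 0\ ([\mathfrak{q}_1^{\rho+1}, \dots, \mathfrak{q}_s^{\rho+1}]).$

On the other hand we have

(15') $F_{1\rho}(\omega; \xi) = \omega - (\varphi_0 + \varphi_1 + \cdots + \varphi_\rho) \equiv 0\ (\mathfrak{p}_0^{\rho+1}).$

Consequently, if we put

(16) $G_\rho(\omega; \xi) = \prod_{j=1}^{h-1} F_{j\rho}(\omega; \xi),$

then $G_\rho(\omega; \xi) \equiv 0\ ([\mathfrak{p}_0^{\rho+1}, \mathfrak{q}_1^{\rho+1}, \dots, \mathfrak{q}_s^{\rho+1}])$,
or$^{12}$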


12 If $d_h$ were distinct from any one of the constants $c_i$, $i = 0, 1, \dots, s$, then (15)
would hold true for $i = 0, 1, \dots, s$ and hence (16') would follow directly, independently




(16') $G_\rho(\omega; \xi) \equiv 0\ (\mathfrak{o} \cdot (\xi_1, \dots, \xi_r)^{\rho+1}), \quad (\rho = 0, 1, \dots).$


In view of (16') it must be possible to express $G_\rho(\omega; \xi)$ as a form of
degree $\rho + 1$ in $\xi_1, \dots, \xi_r$ with coefficients which are elements in $\mathfrak{o}$. Now
any element in $\mathfrak{o}$ is an integral function of $\xi_1, \dots, \xi_r$ and hence can be
expressed in the form $g(\omega; \xi)/D(\xi)$, where $g$ is a polynomial and $D$ is the
discriminant of $F(z; \xi)$. Hence we may write $G_\rho(\omega; \xi)$ in the form:

(17) $G_\rho(\omega; \xi) = g_\rho(\omega; \xi)/D(\xi),$

where $g_\rho$ is a polynomial in $\omega$, of degree $\leq m - 1$, with coefficients which
are polynomials in $\xi_1, \dots, \xi_r$ having no terms of degree $< \rho + 1$. Now
$G_\rho(\omega; \xi)$, according to (16), is of degree $m - m_h \leq m - 1$ in $\omega$. Hence (17)
must be an identity in $\omega, \xi_1, \dots, \xi_r$. This is impossible, if $\rho$ is sufficiently
high. Namely, if the terms of lowest degree in $D$ are of degree $q$, then the
coefficient of the highest power of $\omega$ in $D \cdot G_\rho(\omega; \xi)$ begins with terms of
degree $q$ (since the leading coefficient of $G_\rho$ is 1), while $g_\rho(\omega; \xi)$ has no terms
of degree $< \rho + 1$ in $\xi_1, \dots, \xi_r$. Hence, if $\rho + 1 > q$, the relation (17) is
impossible.

We have thus shown that there exist elements $\omega$ in $\mathfrak{o}$ with the property
that, if $F(z; \xi) = N(z - \omega)$, then $F'_\omega(\omega; \xi) \not\equiv 0\ (\mathfrak{p}_0)$. Namely, we have also
shown that if $\omega \equiv c_i\ (\mathfrak{p}_i)$, $i = 0, 1, \dots, s$, then $\omega$ certainly enjoys this property,
provided it is a primitive element over $K(\xi_1, \dots, \xi_r)$ and provided $c_0 \neq c_i$,
$i = 1, 2, \dots, s$.

We now wish to prove that the first provision is a consequence of the
second. Let $\zeta$ be any element in $\mathfrak{o}$, $\zeta \equiv b_i\ (\mathfrak{p}_i)$, $i = 0, 1, \dots, s$, and let us
assume that $b_0 \neq b_i$, $i = 1, 2, \dots, s$. We have to show that $\zeta$ is a primitive
element.

We fix in $\mathfrak{o}$ a primitive element $\omega$ such that $\omega \equiv c_i\ (\mathfrak{p}_i)$ and $c_0, c_1, \dots, c_s$
are distinct constants. Let $N(z - \omega) = F(z; \xi)$ and let us again consider
the factorization (11) of $F$ in $K\{\xi_1, \dots, \xi_r\}[z]$. Let $K^*\{\xi_1, \dots, \xi_r\}$ denote
the field of meromorphic functions of $\xi_1, \dots, \xi_r$ and let $\Sigma_i$ be the algebraic
extension of this field defined by the irreducible equation $F_i(\omega; \xi) = 0$.
Our field $\Sigma = K(\xi_1, \dots, \xi_r, \omega)$ can be regarded as a subfield of $\Sigma_i$.$^{13}$ Let


of (15'), i.e. independently of the assumption that $P$ is a simple point. The remainder
of the proof is based on the impossibility of the congruences (16'). Hence this also
proves that any $d_j$ must coincide with some $c_i$.

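13 The following are well-known facts (see [7], p. 47). The ring of residual classes
of $K^*\{\xi_1, \dots, \xi_r\}[z]$ modulo $F$ contains the field $\Sigma$ and is the direct sum $\Sigma_1 + \cdots + \Sigma_h$
of the fields $\Sigma_i$; each field $\Sigma_i$ contains a subfield $\Sigma_i^{(0)} \cong \Sigma$, and the decomposition
$\alpha = \alpha_1 + \alpha_2 + \cdots + \alpha_h$, $\alpha \in \Sigma$, $\alpha_i \in \Sigma_i^{(0)}$, sets up an isomorphism $\alpha \to \alpha_i$ between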




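$N_i(z - \zeta)$ denote the norm of $z - \zeta$ over $K^*\{\xi_1, \dots, \xi_r\}$, when $\zeta$ is regarded
as an element of $\Sigma_i$. Since, in $\Sigma_i$, $\zeta$ is an integral function of $\xi_1, \dots, \xi_r$, it
follows that $N_i(z - \zeta)$ is a polynomial in $z$ with coefficients in $K\{\xi_1, \dots, \xi_r\}$
and leading coefficient 1. Moreover, if $N(z - \zeta)$ denotes the usual norm of
$z - \zeta$, when $\zeta$ is regarded as an element of $\Sigma$, then

$N(z - \zeta) = \prod_{i=1}^{h} N_i(z - \zeta).$

In particular, since $F_1(z; \xi) = z - \varphi$, $\varphi \in K\{\xi_1, \dots, \xi_r\}$, we have
$N_1(z - \zeta) = z - \psi$, where $\psi = \psi_0 + \psi_1 + \cdots$ (the expansion of $\zeta$) and
where $\psi_0 = b_0$.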

Now let us assume that $\zeta$ is not a primitive element. Then $N(z - \zeta)$ is
the power of a polynomial of degree $< m$, whence one of the factors $N_i(z - \zeta)$,
$i \neq 1$, must be divisible by $z - \psi$. Let it be $N_2(z - \zeta)$. Then $N_2(z - \zeta)$ is
necessarily a power of $z - \psi$:

(18) $N_2(z - \zeta) = (z - \psi)^{m_2}.$

We consider in 3; the ring 0; = K{é&,- - &} and the ideal 
po? = 0;° Po. It is clear that po‘ contains the ideal of the non-units in 
K{é,- - -,&-} and that =K. Thus is prime and every element 


in satisfies a congruence of the form eC K. By (18), 
as an element of the field %,, satisfies the relation —y(é,°--,é-) =9, 
whence i. do (Po), since be. We consider the con- 
tracted ideal This ideal contains the elements é,,: More- 
over, it also contains » — since the equation = 0 yields the con- 
gruence (w — (by (12’)). Now po, +, Ps are the only 


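prime ideals in $\mathfrak{o}$ which divide the ideal $\mathfrak{o} \cdot (\xi_1, \dots, \xi_r)$. Since $\mathfrak{p}_0^{(2)} \cap \mathfrak{o}$
is a prime 0-dimensional ideal, it must therefore coincide with one of the
ideals $\mathfrak{p}_0, \mathfrak{p}_1, \dots, \mathfrak{p}_s$. It cannot coincide with $\mathfrak{p}_0$, since $d_2 \neq c_0$ and therefore
$\omega - d_2 \not\equiv 0\ (\mathfrak{p}_0)$. Hence $\mathfrak{p}_0^{(2)} \cap \mathfrak{o}$ is one of the ideals $\mathfrak{p}_i$, $i \neq 0$. Let, say,
$\mathfrak{p}_0^{(2)} \cap \mathfrak{o} = \mathfrak{p}_1$. We have then $\zeta \equiv b_0\ (\mathfrak{p}_1)$, and this is in contradiction with
our hypothesis $b_0 \neq b_i$, $i \neq 0$. Hence $\zeta$ is a primitive element, as was asserted.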

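To complete the proof of the theorem, we still have to show that, conversely,
$F'_\omega(\omega; \xi) \not\equiv 0\ (\mathfrak{p}_0)$ implies $c_0 \neq c_i$, $i = 1, 2, \dots, s$, where,
again, $F(z; \xi_1, \dots, \xi_r) = N(z - \omega)$ and $\omega \equiv c_i\ (\mathfrak{p}_i)$. The hypothesis
$F'_\omega \not\equiv 0\ (\mathfrak{p}_0)$ signifies that $c_0$ is a simple root of $F(z; 0, \dots, 0)$. Hence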


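$\Sigma$ and $\Sigma_i^{(0)}$. In particular we have $\omega = \omega_1 + \cdots + \omega_h$, where $F_i(\omega_i; \xi_1, \dots, \xi_r) = 0$.
There can be no confusion if, for a fixed $i$, we identify $\omega$ with $\omega_i$ and $\Sigma$ with $\Sigma_i^{(0)}$.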




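the discriminant $D(\xi_1, \dots, \xi_r)$ of $F(z; \xi)$ does not vanish, and consequently
$\omega$ is a primitive element. Let $\zeta$ be an element in $\mathfrak{o}$ which assumes distinct
values $b_0, b_1, \dots, b_s$ at $\mathfrak{p}_0, \mathfrak{p}_1, \dots, \mathfrak{p}_s$, and let, in our preceding notations,
$N(z - \zeta) = G(z; \xi)$, $N_i(z - \zeta) = G_i(z; \xi)$. Here $G_1(z; \xi) = z - b_0 - \psi_1
- \psi_2 - \cdots$. Since $z - b_1$ is a factor of $G(z, 0, \dots, 0)$, one of the polynomials
$G_i(z; 0, \dots, 0)$, $i \neq 1$, must be a power of $z - b_1$. Let, say,
$G_2(z; 0, \dots, 0) = (z - b_1)^{m_2}$. Then it follows that $\zeta \equiv b_1\ (\mathfrak{p}_0^{(2)})$ and we
conclude as before that $\mathfrak{p}_0^{(2)} \cap \mathfrak{o} = \mathfrak{p}_1$. Now $\omega \equiv d_2\ (\mathfrak{p}_0^{(2)})$, hence $\omega \equiv d_2\ (\mathfrak{p}_1)$,
i.e. $d_2 = c_1$. Since $c_0$ is a simple root of $F(z; 0, \dots, 0)$, the remaining
roots $d_2, \dots, d_h$ are all distinct from $d_1$ ($= c_0$). Consequently $c_1 \neq c_0$.
Similarly $c_2, \dots, c_s \neq c_0$, and this completes the proof of our theorem.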


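6. Second proof. Let $\Sigma = \Sigma^{(1)}, \Sigma^{(2)}, \dots, \Sigma^{(m)}$ be the conjugates of
$\Sigma$ over $K(\xi_1, \dots, \xi_r)$ and let $\Sigma^*$ be the Galois extension
field obtained by adjoining to $K(\xi_1, \dots, \xi_r)$ the elements $\xi_{r+1}, \dots, \xi_n$ and
their conjugates. Let $\mathfrak{o}_j = K[\xi_1^{(j)}, \dots, \xi_n^{(j)}]$ and let $\mathfrak{o}^*$ be the
smallest ring in $\Sigma^*$ which contains the $m$ conjugate rings $\mathfrak{o}_1, \mathfrak{o}_2, \dots, \mathfrak{o}_m$.$^{14}$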


By the isomorphism between $\mathfrak{o}$ ($= \mathfrak{o}_1$) and $\mathfrak{o}_j$, the ideal $\mathfrak{p}_0$
in $\mathfrak{o}$ is carried into a prime 0-dimensional ideal $\mathfrak{p}_0^{(j)}$ in $\mathfrak{o}_j$. Again the $m$ ideals
$\mathfrak{p}_0^{(j)}$, in general, need not be distinct. However, under the hypothesis of the
theorem we prove that not only are they distinct, but that also any two of the
extended ideals $\mathfrak{o}^*\mathfrak{p}_0^{(j)}$ in $\mathfrak{o}^*$ have no common divisors.

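Let us consider, for instance, $\mathfrak{o}^*\mathfrak{p}_0^{(1)}$ and $\mathfrak{o}^*\mathfrak{p}_0^{(2)}$. These ideals are
unmixed and 0-dimensional, since every element in $\mathfrak{o}^*$ is integrally dependent
on $\mathfrak{o}_j$. Hence if they have a common divisor, they also have a common prime
0-dimensional divisor, say $\mathfrak{p}^*$. Let $\omega_1$ be a primitive element in $\mathfrak{o}$ over
$K(\xi_1, \dots, \xi_r)$ and let $\omega_2$ be its conjugate in $\mathfrak{o}_2$. By Theorem 2 we have
$\omega_1 \equiv \psi_0 + \psi_1 + \cdots + \psi_{i-1}\ ((\mathfrak{p}_0^{(1)})^i)$, where $\psi_0 + \psi_1 + \cdots$ is the
expansion of $\omega_1$ into a power series of $\xi_1, \dots, \xi_r$. Applying a substitution of
the Galois group of $\Sigma^*/K(\xi_1, \dots, \xi_r)$ which carries $\mathfrak{o}_1$ into $\mathfrak{o}_2$, we get:
$\omega_2 \equiv \psi_0 + \psi_1 + \cdots + \psi_{i-1}\ ((\mathfrak{p}_0^{(2)})^i)$. Since $\mathfrak{p}^*$ is a common divisor of $\mathfrak{p}_0^{(1)}$
and $\mathfrak{p}_0^{(2)}$, it follows $\omega_1 \equiv \psi_0 + \cdots + \psi_{i-1}\ (\mathfrak{p}^{*i})$ and $\omega_2 \equiv \psi_0 + \cdots + \psi_{i-1}\ (\mathfrak{p}^{*i})$,
whence

$\omega_1 - \omega_2 \equiv 0\ (\mathfrak{p}^{*i}), \quad (i = 1, 2, \dots).$

The validity of this congruence for all $i$ implies $\omega_1 = \omega_2$, in contradiction
with the hypothesis that $\omega_1$ is a primitive element. This proves that $\mathfrak{o}^*\mathfrak{p}_0^{(1)}$
and $\mathfrak{o}^*\mathfrak{p}_0^{(2)}$ have no common divisors.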

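14 These $m$ rings $\mathfrak{o}_j$ are not necessarily distinct.

Let

(19) $\mathfrak{o}^*\mathfrak{p}_0^{(j)} = [\mathfrak{q}_1^{(j)}, \mathfrak{q}_2^{(j)}, \dots, \mathfrak{q}_\sigma^{(j)}]$

be the decomposition of $\mathfrak{o}^*\mathfrak{p}_0^{(j)}$ into maximal primary (0-dimensional) components.
Let $\mathfrak{q}_\alpha^{(j)}$ belong to the prime ideal $\mathfrak{p}_\alpha^{(j)}$. The prime ideals
$\mathfrak{p}_1^{(j)}, \dots, \mathfrak{p}_\sigma^{(j)}$ are the conjugates of the ideals $\mathfrak{p}_1^{(1)}, \dots, \mathfrak{p}_\sigma^{(1)}$ respectively,
under a substitution of the Galois group of $\Sigma^*$ which carries $\mathfrak{o}_1$ into $\mathfrak{o}_j$. Since
any two of the ideals $\mathfrak{o}^*\mathfrak{p}_0^{(j)}$ have no common divisors, the $\sigma m$ ideals $\mathfrak{p}_\alpha^{(j)}$
are all distinct.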

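Let now $\omega = \omega_1$ be an element of $\mathfrak{o}$ ($= \mathfrak{o}_1$) such that $\omega \equiv c_0\ (\mathfrak{p}_0)$ and
$\omega \not\equiv c_0\ (\mathfrak{p}_i)$, $i = 1, 2, \dots, s$. Since $\omega \equiv c_0\ (\mathfrak{p}_0)$, it follows from (19) that

(20) $\omega_1 - c_0 \equiv 0\ (\mathfrak{p}_\alpha^{(1)}), \quad (\alpha = 1, 2, \dots, \sigma).$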


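We have $\mathfrak{o} \cdot (\xi_1, \dots, \xi_r) \subseteq \mathfrak{p}_0$, whence $\mathfrak{o}^*(\xi_1, \dots, \xi_r) \subseteq \mathfrak{p}_\alpha^{(1)}$, $\alpha = 1, 2,
\dots, \sigma$. Since the ideal $\mathfrak{o}^*(\xi_1, \dots, \xi_r)$ is invariant under all the substitutions
of the Galois group of $\Sigma^*$, it follows $\mathfrak{o}^*(\xi_1, \dots, \xi_r) \subseteq \mathfrak{p}_\alpha^{(j)}$, i.e. any
of the $m\sigma$ ideals $\mathfrak{p}_\alpha^{(j)}$ is a divisor of $\mathfrak{o}^*(\xi_1, \dots, \xi_r)$. Since $\mathfrak{p}_0, \mathfrak{p}_1, \dots, \mathfrak{p}_s$
are the prime ideals of $\mathfrak{o} \cdot (\xi_1, \dots, \xi_r)$, any $\mathfrak{p}_\alpha^{(j)}$ must be a divisor of one
of these ideals. Now if $j \neq 1$, then $\mathfrak{p}_\alpha^{(j)}$ is not a divisor of $\mathfrak{p}_0$ ($= \mathfrak{p}_0^{(1)}$), since
the prime ideals $\mathfrak{p}_\beta^{(1)}$ of $\mathfrak{o}^*\mathfrak{p}_0$ are all distinct from any $\mathfrak{p}_\alpha^{(j)}$. Hence,
if $j \neq 1$, we must have, for some $\beta \neq 0$, $\mathfrak{p}_\beta \equiv 0\ (\mathfrak{p}_\alpha^{(j)})$. Since, by assumption,
$\omega - c_0 \not\equiv 0\ (\mathfrak{p}_\beta)$ if $\beta \neq 0$, it follows

(21) $\omega_1 - c_0 \not\equiv 0\ (\mathfrak{p}_\alpha^{(j)}), \quad (j = 2, \dots, m;\ \alpha = 1, 2, \dots, \sigma).$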


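We now consider one of the ideals $\mathfrak{p}_\alpha^{(1)}$, say $\mathfrak{p}_1^{(1)}$. Let $\omega_2, \dots, \omega_m$
be the conjugates of $\omega = \omega_1$; here $\omega_j \in \mathfrak{o}_j$ and we do not yet assert that
$\omega_1, \dots, \omega_m$ are distinct, since we do not know whether $\omega$ is a primitive element.
A substitution of the Galois group of $\Sigma^*$ which carries $\mathfrak{o}_1$ into $\mathfrak{o}_2$ will
carry $\omega_1$ into $\omega_2$ and also some ring $\mathfrak{o}_j$, $j \neq 1$, into $\mathfrak{o}_1$. Hence it will carry
some ideal $\mathfrak{p}_\alpha^{(j)}$, $j \neq 1$, into $\mathfrak{p}_1^{(1)}$. Applying this substitution to (21) we
obtain:

$\omega_2 - c_0 \not\equiv 0\ (\mathfrak{p}_1^{(1)}).$

In a similar manner we get $\omega_j - c_0 \not\equiv 0\ (\mathfrak{p}_1^{(1)})$, $j = 3, \dots, m$. Reassuming
and recalling (20), we have:

(22) $\omega_1 - c_0 \equiv 0\ (\mathfrak{p}_1^{(1)}), \quad \omega_2 - c_0 \not\equiv 0\ (\mathfrak{p}_1^{(1)}), \quad \dots, \quad \omega_m - c_0 \not\equiv 0\ (\mathfrak{p}_1^{(1)}).$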


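From (22) it follows that $\omega_1 \neq \omega_j$, $j \neq 1$, whence $\omega$ is a primitive element.
Moreover, let $N(z - \omega) = F(z; \xi_1, \dots, \xi_r) = (z - \omega_1)(z - \omega_2) \cdots (z - \omega_m)$, and
let $\omega_j \equiv c_j\ (\mathfrak{p}_1^{(1)})$, $j = 2, \dots, m$. If we reduce the coefficients of $F(z; \xi_1, \dots, \xi_r)$
mod $\mathfrak{p}_1^{(1)}$, we must replace $\xi_1, \dots, \xi_r$ by zeros. Hence

$F(z; 0, \dots, 0) = (z - c_0)(z - c_2) \cdots (z - c_m).$

Since $c_j \neq c_0$, it follows that $c_0$ is a simple root of $F(z; 0, \dots, 0)$. Hence
$F'_\omega(\omega; \xi_1, \dots, \xi_r) \not\equiv 0\ (\mathfrak{p}_0)$, as was asserted.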

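Conversely, let us assume that $c_0$ is a simple root of $F(z; 0, \dots, 0)$.
This implies that $\omega_j \not\equiv c_0\ (\mathfrak{p}_\alpha^{(1)})$, $j \neq 1$, $\alpha = 1, 2, \dots, \sigma$, since $\omega_1 \equiv c_0\ (\mathfrak{p}_\alpha^{(1)})$.
Applying the substitutions of the Galois group of $\Sigma^*$ which carry $\mathfrak{o}_j$ into $\mathfrak{o}_1$
we get (21), and from this, retracing our reasoning which led to (21), we
conclude that $\omega \not\equiv c_0\ (\mathfrak{p}_i)$, $i = 1, 2, \dots, s$. Thus all the assertions of our
theorem are proved.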


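7. It is clear that Theorem 4 remains true if we replace $\xi_1, \dots, \xi_r$ by
any other set of uniformizing parameters $\eta_1, \dots, \eta_r$ in $\mathfrak{o}$, such that every
element in $\mathfrak{o}$ is integrally dependent on $\eta_1, \dots, \eta_r$. Let now $\eta_1, \dots, \eta_r$ be any
set of $r$ algebraically independent elements in $\mathfrak{o}$, such that every element in $\mathfrak{o}$
is integrally dependent on $\eta_1, \dots, \eta_r$. Let $F'_\omega(\omega; \eta_1, \dots, \eta_r)$ be the different
of an element $\omega$ in $\mathfrak{o}$, where $F(z; \eta_1, \dots, \eta_r)$ is the norm $N(z - \omega)$ over
$K(\eta_1, \dots, \eta_r)$. We consider the ideal $Z_{(\eta)} = Z_{\eta_1, \dots, \eta_r}$ generated by the
differents $F'_\omega$, as $\omega$ varies arbitrarily in $\mathfrak{o}$ (the h.c.d. of the ideals $\mathfrak{o} \cdot F'_\omega$,
$\omega$ arbitrary in $\mathfrak{o}$). Let $\mathfrak{p}_0$ be a prime zero-dimensional ideal in $\mathfrak{o}$, $P$ the
corresponding point of $V_r$, and let $\eta_i \equiv c_i\ (\mathfrak{p}_0)$. If $\mathfrak{p}_0$ is an isolated component
of the ideal $\mathfrak{o} \cdot (\eta_1 - c_1, \dots, \eta_r - c_r)$, then $P$ is a simple point of $V_r$ and
$\eta_1 - c_1, \dots, \eta_r - c_r$ are uniformizing parameters for the whole neighborhood
of $P$. It follows then, by Theorem 4, that for some $\omega$ in $\mathfrak{o}$ it is true that
$F'_\omega \not\equiv 0\ (\mathfrak{p}_0)$. Hence $Z_{(\eta)} \not\equiv 0\ (\mathfrak{p}_0)$.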

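Conversely, let us assume that $Z_{(\eta)} \not\equiv 0\ (\mathfrak{p}_0)$. There exists then an element
$\omega$ in $\mathfrak{o}$ for which $F'_\omega \not\equiv 0\ (\mathfrak{p}_0)$, i.e. such that $\omega \equiv d_0\ (\mathfrak{p}_0)$ and $d_0$ is a simple root
of $F(z; c_1, \dots, c_r)$. Let $\mathfrak{A} = \mathfrak{o} \cdot (\eta_1 - c_1, \dots, \eta_r - c_r) = [\mathfrak{q}_0, \mathfrak{q}_1, \dots, \mathfrak{q}_s]$,
where $\mathfrak{q}_0, \mathfrak{q}_1, \dots, \mathfrak{q}_s$ are all 0-dimensional and $\mathfrak{q}_0$ belongs to $\mathfrak{p}_0$. If we reduce
the equation $F(\omega; \eta_1, \dots, \eta_r) = 0$ modulo $\mathfrak{A}$ we get $F(\omega; c_1, \dots, c_r) \equiv 0\ (\mathfrak{A})$ or

(23) $(\omega - d_0) \prod_{i=1}^{m-1} (\omega - d_i) \equiv 0\ (\mathfrak{A}), \quad d_i \neq d_0 \text{ if } i \neq 0,$

where $m$ is the relative degree of $\Sigma$ over $K(\eta_1, \dots, \eta_r)$. Since $\omega - d_i \not\equiv 0\ (\mathfrak{p}_0)$,
$i \neq 0$, it follows by (23) that $\omega - d_0 \equiv 0\ (\mathfrak{q}_0)$.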

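Let now $\zeta$ be any element in $\mathfrak{p}_0$. We form the norm $N(z - \theta)$, where
$\theta = \omega + t\zeta$, $t$ a parameter. This norm is a polynomial $G(z; t, \eta_1, \dots, \eta_r)$
which for $t = 0$ becomes $F(z; \eta_1, \dots, \eta_r)$. Moreover $G'_z(\theta; t, \eta_1, \dots, \eta_r)
= H(t)$, where $H(t)$ is a polynomial in $t$ with coefficients in $\mathfrak{o}$. Now
$H(0) = F'_\omega \not\equiv 0\ (\mathfrak{p}_0)$, hence the coefficients of $H(t)$ are not all $\equiv 0\ (\mathfrak{p}_0)$.
As a consequence, if $t_0$ is a non-special value of $t$, and if $\theta_0 = \omega + t_0\zeta$, then
$G'_{0z}(\theta_0; \eta_1, \dots, \eta_r) \not\equiv 0\ (\mathfrak{p}_0)$, where $G_0(z; \eta_1, \dots, \eta_r) = N(z - \theta_0)$. Since
$\theta_0 \equiv d_0\ (\mathfrak{p}_0)$, it follows, as above for $\omega$, that $\theta_0 \equiv d_0\ (\mathfrak{q}_0)$, i.e.
$\omega + t_0\zeta \equiv d_0\ (\mathfrak{q}_0)$. Since also $\omega \equiv d_0\ (\mathfrak{q}_0)$ and since we may assume
$t_0 \neq 0$, we conclude that $\zeta \equiv 0\ (\mathfrak{q}_0)$. Thus every element of $\mathfrak{p}_0$ is in $\mathfrak{q}_0$,
i.e. $\mathfrak{p}_0 = \mathfrak{q}_0$, whence $\mathfrak{p}_0$ is an isolated component of the ideal
$\mathfrak{o} \cdot (\eta_1 - c_1, \dots, \eta_r - c_r)$. We have therefore proved
that $\mathfrak{p}_0$ is an isolated component of $\mathfrak{o} \cdot (\eta_1 - c_1, \dots, \eta_r - c_r)$ if and only if
$Z_{(\eta)} \not\equiv 0\ (\mathfrak{p}_0)$.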

If we then denote by $Z$ the h.c.d. of all ideals $Z_{(\eta)}$, as $\eta_1, \dots, \eta_r$ vary
arbitrarily in $\mathfrak{o}$ (subject to the only condition that every element in $\mathfrak{o}$ be
integrally dependent on $\eta_1, \dots, \eta_r$), we conclude with the following


THEOREM 5. If $P$ is a point on $V_r$ and if $\mathfrak{p}_0$ is the corresponding prime
0-dimensional ideal in $\mathfrak{o}$, a necessary and sufficient condition that $P$ be a simple
point is that $Z \not\equiv 0\ (\mathfrak{p}_0)$.


As a corollary we have that the manifold of singular points of $V_r$ is
algebraic, of dimension $\leq r - 1$. It is the manifold given by the ideal $Z$.


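8. Actually the singular manifold can be defined by an ideal which is
possibly a multiple of $Z$ and in whose construction intervene only the coördinates
$\xi_1, \dots, \xi_n$ of the general point of $V_r$. Namely, let $P$ be a simple point
of $V_r$, $\mathfrak{p}_0$ the corresponding ideal in $\mathfrak{o}$, and let $\xi_i \equiv c_i\ (\mathfrak{p}_0)$. We know from
section 2 that if $\bar\xi_i = u_{i1}\xi_1 + \cdots + u_{in}\xi_n$, $i = 1, 2, \dots, r$, and $\bar\xi_i \equiv d_i\ (\mathfrak{p}_0)$,
then $\mathfrak{p}_0$ is an isolated component of the ideal $\mathfrak{o} \cdot (\bar\xi_1 - d_1, \dots, \bar\xi_r - d_r)$,
provided the coefficients $u_{ij}$ are non-special. Moreover, every element in $\mathfrak{o}$ is
then integrally dependent on $\bar\xi_1, \dots, \bar\xi_r$. Hence $P$ is a simple point if and
only if $Z_{\bar\xi_1, \dots, \bar\xi_r} \not\equiv 0\ (\mathfrak{p}_0)$, for general values of the coefficients $u_{ij}$. Now,
assume that $Z_{\bar\xi_1, \dots, \bar\xi_r} \not\equiv 0\ (\mathfrak{p}_0)$, i.e. there exists an element $\omega$ in $\mathfrak{o}$ such that
$F'_\omega \not\equiv 0\ (\mathfrak{p}_0)$, where $F(z; \bar\xi_1, \dots, \bar\xi_r) = N(z - \omega)$, the norm of $z - \omega$ relative
to the field $K(\bar\xi_1, \dots, \bar\xi_r)$. We assert that there also exists such an element
$\omega$ which is linear and homogeneous in $\xi_1, \dots, \xi_n$. In fact, let

$\mathfrak{o} \cdot (\bar\xi_1 - d_1, \dots, \bar\xi_r - d_r) = [\mathfrak{p}_0, \mathfrak{q}_1, \dots, \mathfrak{q}_s]$

be the decomposition of $\mathfrak{o} \cdot (\bar\xi_1 - d_1, \dots, \bar\xi_r - d_r)$ into primary (0-dimensional)
components, and let $\xi_i \equiv c_{ij}\ (\mathfrak{p}_j)$, $i = 1, 2, \dots, n$, $j = 0, 1, \dots, s$,
where $\mathfrak{q}_j$ belongs to the prime ideal $\mathfrak{p}_j$. Since $\mathfrak{p}_j$ and $\mathfrak{p}_0$ determine two distinct
points of $V_r$, if $j \neq 0$, namely $P_j(c_{1j}, \dots, c_{nj})$ and $P_0(c_{10}, \dots, c_{n0})$, it
follows that constants $v_1, \dots, v_n$ can be found in such a manner that

$v_1 c_{1j} + \cdots + v_n c_{nj} \neq v_1 c_{10} + \cdots + v_n c_{n0}, \quad (j = 1, 2, \dots, s).$

Let $\bar\xi = v_1\xi_1 + \cdots + v_n\xi_n$. The value of the element $\bar\xi$ at $P_0$ is then distinct
from its value at any of the points $P_1, \dots, P_s$, and hence by Theorem 4 it
follows that $F'_{\bar\xi} \not\equiv 0\ (\mathfrak{p}_0)$, where $F(z; \bar\xi_1, \dots, \bar\xi_r) = N(z - \bar\xi)$.
This proves our assertion.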
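The existence of the constants $v_1, \dots, v_n$ is elementary: each point $P_j \neq P_0$ excludes only a proper linear subspace of choices. A minimal sketch, assuming rational or integer coördinates and using the hypothetical helpers `value` and `separating_form`, searches small integer vectors for one that separates $P_0$ from the other points.

```python
from itertools import product

def value(v, point):
    """Value of the linear form v_1 x_1 + ... + v_n x_n at a point."""
    return sum(vi * ci for vi, ci in zip(v, point))

def separating_form(p0, others, bound=3):
    """Search integer vectors v with |v_i| <= bound such that the linear
    form takes at every point of `others` a value different from its
    value at p0. Returns None if no such vector exists within the bound."""
    for v in product(range(-bound, bound + 1), repeat=len(p0)):
        base = value(v, p0)
        if all(value(v, pj) != base for pj in others):
            return v
    return None

p0 = (0, 0, 1)
others = [(0, 0, 2), (1, 0, 1)]
v = separating_form(p0, others)
print("v =", v, "values:", value(v, p0), [value(v, pj) for pj in others])
```

The bound on the search is an arbitrary choice made for the illustration; over an infinite field a suitable vector always exists when the points are distinct from $P_0$.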
From these considerations we conclude immediately with the following


THEOREM 6. Let $\bar\xi_i = u_{i1}\xi_1 + \cdots + u_{in}\xi_n$, $i = 1, 2, \dots, r + 1$, be
$r + 1$ linear forms with indeterminate coefficients $u_{ij}$, and let $F(\bar\xi_1, \dots, \bar\xi_{r+1})
= 0$ be the irreducible algebraic relation between the $\bar\xi_i$. Let $\mathfrak{B}'$ be the ideal
whose basis consists of the coefficients $f_\sigma(\xi_1, \dots, \xi_n)$ of the various power
products of the $u_{ij}$ in the polynomial $F'_{\bar\xi_{r+1}}$ (the $\bar\xi$'s having been replaced in
$F'_{\bar\xi_{r+1}}$ by the corresponding linear forms in the $\xi$'s). The submanifold of $V_r$
defined by this ideal $\mathfrak{B}'$ is the manifold of singular points of $V_r$.


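9. Let $\eta_1, \dots, \eta_r$ be elements in $\mathfrak{o}$ such that every element in $\mathfrak{o}$ is
integrally dependent on $K[\eta_1, \dots, \eta_r]$ and let $a_1, \dots, a_r$ be arbitrary constants.
The ideal $\mathfrak{A} = \mathfrak{o} \cdot (\eta_1 - a_1, \dots, \eta_r - a_r)$ is unmixed and zero
dimensional,

$\mathfrak{A} = [\mathfrak{q}_0, \mathfrak{q}_1, \dots, \mathfrak{q}_s] = \mathfrak{q}_0 \mathfrak{q}_1 \cdots \mathfrak{q}_s,$

where $\mathfrak{q}_i$ is a primary ideal belonging to the zero-dimensional prime ideal $\mathfrak{p}_i$.
Let $\omega$ be an element in $\mathfrak{o}$ and let $G(z; \eta_1, \dots, \eta_r) = N(z - \omega)$ be the norm of
$z - \omega$ with respect to the field $K(\eta_1, \dots, \eta_r)$. We have proved in sections
5, 6 that if $\mathfrak{q}_i \neq \mathfrak{p}_i$ and if $\omega \equiv c_i\ (\mathfrak{p}_i)$, then $c_i$ is necessarily a multiple root
of $G(z; a_1, \dots, a_r)$. The following theorem gives a lower bound for the
multiplicity of the root $c_i$:


THEOREM 7. If $\mathfrak{q}_i$ belongs to the exponent $\rho_i$, then the multiplicity of
the root $c_i$ is $\geq \rho_i$.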


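Proof. Let $\omega_1, \dots, \omega_\nu$ be an integral base of $\mathfrak{o}$ over the polynomial ring
$K[\eta_1, \dots, \eta_r]$. We adjoin to the ground field $K$ the indeterminates $u_1, u_2, \dots, u_\nu$,
and we consider the element $\omega_u = u_1\omega_1 + \cdots + u_\nu\omega_\nu$. Let

$N(z - \omega_u) = F(z; \eta_1, \dots, \eta_r; u_1, \dots, u_\nu) = F(z; \eta; u)$

be the norm of $z - \omega_u$ with respect to the field $K(\eta; u)$. It is clear that if
$\omega_i \equiv c_{ij}\ (\mathfrak{p}_j)$, $j = 0, 1, \dots, s$, then $u_1 c_{1j} + \cdots + u_\nu c_{\nu j}$ is a root of
$F(z; a_1, \dots, a_r; u_1, \dots, u_\nu)$. It has been proved by van
der Waerden [8] that, conversely, every root of $F(z; a; u)$ is a linear form
$u_1 d_1 + \cdots + u_\nu d_\nu$, $d_j \in K$, and that $(a_1, \dots, a_r, d_1, \dots, d_\nu)$ is a point of
the variety whose general point has coördinates $\eta_1, \dots, \eta_r, \omega_1, \dots, \omega_\nu$. In
other words: $d_1, \dots, d_\nu$ must be the values of $\omega_1, \dots, \omega_\nu$ at one of the ideals
$\mathfrak{p}_0, \mathfrak{p}_1, \dots, \mathfrak{p}_s$. As a consequence, $F(z; a; u)$ is a product of factors

$F(z; a; u) = \prod_{j=0}^{s} (z - c_{1j}u_1 - \cdots - c_{\nu j}u_\nu)^{\sigma_j}.$

Since $\omega_1, \dots, \omega_\nu$ is a base of $\mathfrak{o}$, it follows that
$(c_{1j}, \dots, c_{\nu j}) \neq (c_{1j'}, \dots, c_{\nu j'})$ if $j \neq j'$.
Hence for $j \neq j'$, the factors $z - c_{1j}u_1 - \cdots - c_{\nu j}u_\nu$ and $z - c_{1j'}u_1 - \cdots
- c_{\nu j'}u_\nu$ are distinct.

We have $F(\omega_u; \eta; u) = 0$. Reducing the $\eta$ modulo $\mathfrak{A}$ we get the congruence
$F(\omega_u; a; u) \equiv 0\ (\mathfrak{A})$, or

(24) $\prod_{j=0}^{s} [u_1(\omega_1 - c_{1j}) + \cdots + u_\nu(\omega_\nu - c_{\nu j})]^{\sigma_j} \equiv 0\ (\mathfrak{A}).$

This congruence should be intended in the following sense: if $\omega_u$ is
replaced by $u_1\omega_1 + \cdots + u_\nu\omega_\nu$, then on the left-hand side we get a polynomial
in $u_1, \dots, u_\nu$, and the coefficients (elements of $\mathfrak{o}$) of the various power
products of the $u$'s belong to $\mathfrak{A}$. Now, for a given $j$, we have

$u_1(\omega_1 - c_{1j'}) + \cdots + u_\nu(\omega_\nu - c_{\nu j'}) \equiv u_1(c_{1j} - c_{1j'}) + \cdots + u_\nu(c_{\nu j} - c_{\nu j'})\ (\mathfrak{p}_j),$

and the coefficients of this linear form in the $u$'s do not all belong to $\mathfrak{p}_j$, if
$j' \neq j$. Hence, by (24),

$[u_1(\omega_1 - c_{1j}) + \cdots + u_\nu(\omega_\nu - c_{\nu j})]^{\sigma_j} \equiv 0\ (\mathfrak{q}_j).$

This implies that all the power products of degree $\sigma_j$ in $\omega_1 - c_{1j}, \dots, \omega_\nu - c_{\nu j}$
are in $\mathfrak{q}_j$. Since $\omega_1 - c_{1j}, \dots, \omega_\nu - c_{\nu j}$ and the elements $\eta_1 - a_1, \dots, \eta_r - a_r$
form a base of $\mathfrak{p}_j$, and since the elements $\eta_1 - a_1, \dots, \eta_r - a_r$ are also in $\mathfrak{q}_j$,
we conclude that $\mathfrak{p}_j^{\sigma_j} \equiv 0\ (\mathfrak{q}_j)$. Hence $\sigma_j \geq \rho_j$.

Now let $\omega$ be any element in $\mathfrak{o}$,

$\omega = P_1\omega_1 + \cdots + P_\nu\omega_\nu, \quad P_i \in K[\eta_1, \dots, \eta_r],$

and let $\omega \equiv c_j\ (\mathfrak{p}_j)$, $P_i \equiv P_{i0}\ (\mathfrak{A})$, $P_{i0} \in K$. Then
$N(z - \omega) = F(z; \eta; P_1, \dots, P_\nu)$, whence

$G(z; a_1, \dots, a_r) = \prod_{j=0}^{s} (z - c_{1j}P_{10} - \cdots - c_{\nu j}P_{\nu 0})^{\sigma_j}.$

Since $c_{1j}P_{10} + \cdots + c_{\nu j}P_{\nu 0} = c_j$, our theorem follows.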

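Do there exist elements $\omega$ for which $c_j$ is a root of multiplicity exactly $\rho_j$?
The answer: not necessarily (except when $\rho_j = 1$, i.e. $\mathfrak{p}_j = \mathfrak{q}_j$, according to
Theorem 4). Here is an example. Let $\eta_3^3 = \eta_1^2\eta_2$ be the defining equation
of an algebraic surface and let $\eta_4 = \eta_3^2/\eta_1$. Consider the ring $\mathfrak{o} = K[\eta_1, \eta_2, \eta_3, \eta_4]$.
Every element of $\mathfrak{o}$ is integrally dependent on $\eta_1, \eta_2$, since $\eta_4^3 = \eta_1\eta_2^2$
(incidentally, it is not difficult to see that $\mathfrak{o}$ is integrally closed). The ideal
$\mathfrak{A} = \mathfrak{o} \cdot (\eta_1, \eta_2)$ is primary, and its prime ideal is $\mathfrak{p}_0 = (\eta_1, \eta_2, \eta_3, \eta_4)$.
The exponent of $\mathfrak{A}$ is 2, since

$\eta_3^2 = \eta_1\eta_4, \quad \eta_4^2 = \eta_2\eta_3, \quad \eta_3\eta_4 = \eta_1\eta_2.$

On the other hand, the field $K(\eta_1, \eta_2, \eta_3, \eta_4)$ is in the present case of relative
degree 3 over $K(\eta_1, \eta_2)$. Since $\mathfrak{A}$ itself is primary, it follows that if $\omega$ is any
element of $\mathfrak{o}$ and if $F(z; \eta_1, \eta_2) = N(z - \omega)$, then $F(z, 0, 0)$ must have a
triple root.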
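The relations used in this example can be verified mechanically in a monomial model. The sketch below rests on an assumption that is not in the text: the parametrization $\eta_1 = s^3$, $\eta_2 = t^3$, $\eta_3 = s^2t$, $\eta_4 = st^2$, under which $\mathfrak{o}$ becomes the ring spanned by the monomials $s^at^b$ with $a + b$ divisible by 3. Monomials are stored as exponent pairs, and the helper names are hypothetical.

```python
from itertools import combinations_with_replacement

# Assumed monomial model (not from the text): eta_i as exponent pairs (a, b) of s^a t^b.
ETA = {1: (3, 0), 2: (0, 3), 3: (2, 1), 4: (1, 2)}

def mul(*monomials):
    """Multiply monomials by adding their exponent pairs."""
    return (sum(m[0] for m in monomials), sum(m[1] for m in monomials))

def in_ring(m):
    """s^a t^b lies in K[eta1, ..., eta4] iff a, b >= 0 and 3 divides a + b."""
    a, b = m
    return a >= 0 and b >= 0 and (a + b) % 3 == 0

def in_ideal_A(m):
    """A monomial lies in the ideal (eta1, eta2) iff the quotient by eta1 or eta2 is in the ring."""
    return any(in_ring((m[0] - g[0], m[1] - g[1])) for g in (ETA[1], ETA[2]))

relations = {
    "eta3^3 = eta1^2 eta2": mul(ETA[3], ETA[3], ETA[3]) == mul(ETA[1], ETA[1], ETA[2]),
    "eta4^3 = eta1 eta2^2": mul(ETA[4], ETA[4], ETA[4]) == mul(ETA[1], ETA[2], ETA[2]),
    "eta3^2 = eta1 eta4": mul(ETA[3], ETA[3]) == mul(ETA[1], ETA[4]),
    "eta4^2 = eta2 eta3": mul(ETA[4], ETA[4]) == mul(ETA[2], ETA[3]),
    "eta3 eta4 = eta1 eta2": mul(ETA[3], ETA[4]) == mul(ETA[1], ETA[2]),
}

# Every product of two generators of p0 lies in A, but eta3 and eta4 do not.
p0_squared_in_A = all(in_ideal_A(mul(ETA[i], ETA[j]))
                      for i, j in combinations_with_replacement(ETA, 2))
print(relations)
print("p0^2 inside A:", p0_squared_in_A,
      "| eta3 in A:", in_ideal_A(ETA[3]), "| eta4 in A:", in_ideal_A(ETA[4]))
```

In this model the squares and products of the generators of $\mathfrak{p}_0$ all fall in $\mathfrak{A}$ while $\eta_3$ and $\eta_4$ do not, which agrees with the exponent 2 claimed for $\mathfrak{A}$.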

II. Properties of the conductor. 


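10. Let $\mathfrak{o}$, $\mathfrak{o}^*$ be two finite integral domains in $\Sigma$. We assume that $\mathfrak{o}$ is
a subring of $\mathfrak{o}^*$ and that $\Sigma$ is the quotient field of $\mathfrak{o}$. The conductor of $\mathfrak{o}$ with
respect to $\mathfrak{o}^*$, in symbols $\mathfrak{c}(\mathfrak{o}, \mathfrak{o}^*) = \mathfrak{c}$, is, by definition, the largest ideal in $\mathfrak{o}$
which is also an ideal in $\mathfrak{o}^*$. This implies that $\mathfrak{c}$ is the totality of all elements
$\xi$ in $\mathfrak{o}$ such that $\xi\mathfrak{o}^* \subseteq \mathfrak{o}$. Every element in $\mathfrak{o}^*$ can therefore be written in the
form of a quotient $\eta/\xi$, $\eta \in \mathfrak{o}$, and $\xi$ any nonzero element in $\mathfrak{c}$.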
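A standard small case of this definition, added here only as an illustration, is $\mathfrak{o} = K[t^2, t^3]$ inside $\mathfrak{o}^* = K[t]$: the conductor is $t^2 K[t]$. More generally, for a ring $K[t^{g_1}, \dots, t^{g_k}]$ generated by powers of $t$ with exponents of greatest common divisor 1, the conductor in $K[t]$ is $t^c K[t]$, where $c$ is the least integer such that every $n \geq c$ is a sum of the $g_i$. The sketch below computes $c$; the function name `conductor_exponent` is hypothetical.

```python
from functools import reduce
from math import gcd

def conductor_exponent(gens):
    """Least c such that every integer n >= c is a non-negative integer
    combination of the positive integers in `gens` (gcd must be 1)."""
    if reduce(gcd, gens) != 1:
        raise ValueError("generators must have greatest common divisor 1")
    m = min(gens)
    member = []   # member[n] is True iff t^n lies in K[t^g : g in gens]
    run = 0       # length of the current run of consecutive members
    n = 0
    while True:
        is_in = (n == 0) or any(n >= g and member[n - g] for g in gens)
        member.append(is_in)
        run = run + 1 if is_in else 0
        # m consecutive members: adding multiples of m reaches everything beyond.
        if run == m:
            return n - m + 1
        n += 1

print(conductor_exponent([2, 3]))   # exponent c for K[t^2, t^3] inside K[t]
print(conductor_exponent([3, 5]))
```

The stopping rule uses the smallest generator $m$: once $m$ consecutive exponents occur, every larger exponent is obtained by adding multiples of $m$.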

It is well known that $\mathfrak{c} \neq (0)$ if and only if every element in $\mathfrak{o}^*$ is
integrally dependent on $\mathfrak{o}$. The proof is immediate. Namely, assume $\mathfrak{c} \neq (0)$
and let $\xi$ be an element in $\mathfrak{c}$, different from zero. The elements in $\mathfrak{o}$ are
integrally dependent on the ring of polynomials $P = K[\eta_1, \dots, \eta_r]$, where
$\eta_1, \dots, \eta_r$ are suitable elements in $\mathfrak{o}$. Hence $\mathfrak{o}$ is a finite $P$-module (since
the totality of all elements of $\Sigma$ which are integrally dependent on $P$ is also a
finite $P$-module; see [7], p. 94). Let $\omega_1, \dots, \omega_m$ be a $P$-basis for $\mathfrak{o}$. Then $\mathfrak{o}^*$
is contained in the finite $P$-module $(\omega_1/\xi, \dots, \omega_m/\xi)$ and hence $\mathfrak{o}^*$ itself is a
finite $P$-module. Consequently, every element in $\mathfrak{o}^*$ is integrally dependent
on $P$.

Conversely, assume that every element in $\mathfrak{o}^*$ is integrally dependent on $\mathfrak{o}$,
whence also on $P$ (in view of the transitivity of integral dependence). Then
$\mathfrak{o}^*$ is a finite $P$-module. Let $\omega^*_1, \dots, \omega^*_m$ be a $P$-basis for $\mathfrak{o}^*$. We can write
each $\omega^*_i$ in the form $\omega_i/\xi$, where $\omega_i, \xi \in \mathfrak{o}$, $\xi$ being a common denominator
(since $\mathfrak{o}^*$ is contained in the quotient ring of $\mathfrak{o}$). Hence $\xi\mathfrak{o}^* \subseteq \mathfrak{o}$, $\xi \in \mathfrak{c}(\mathfrak{o}, \mathfrak{o}^*)$,
$\mathfrak{c} \neq (0)$, q.e.d.


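Let $\mathfrak{p}_0$ be a prime 0-dimensional ideal in $\mathfrak{o}$. We assume that $\mathfrak{c}(\mathfrak{o}, \mathfrak{o}^*) \neq (0)$
and we consider the extended ideal $\mathfrak{o}^*\mathfrak{p}_0$. Since the elements in $\mathfrak{o}^*$ are
integrally dependent on $\mathfrak{o}$, the ideal $\mathfrak{o}^*\mathfrak{p}_0$ is unmixed and zero-dimensional.


THEOREM 8. If $\mathfrak{c}(\mathfrak{o}, \mathfrak{o}^*) \not\equiv 0\ (\mathfrak{p}_0)$, then $\mathfrak{o}^*\mathfrak{p}_0$ is a prime ideal in $\mathfrak{o}^*$.


Proof. Let $\xi$ be an element in $\mathfrak{c}$, not in $\mathfrak{p}_0$, and let $\xi \equiv d\ (\mathfrak{p}_0)$, $d \neq 0$,
$d \in K$. For any element $\omega^*$ in $\mathfrak{o}^*$ we have a relation of the form $\xi\omega^* = \omega \in \mathfrak{o}$.
Reducing this relation modulo $\mathfrak{o}^*\mathfrak{p}_0$ we find $d\omega^* \equiv c\ (\mathfrak{o}^*\mathfrak{p}_0)$, where $\omega \equiv c\ (\mathfrak{p}_0)$.
Since $d \neq 0$, we conclude that the ring of residual classes $\mathfrak{o}^*/\mathfrak{o}^*\mathfrak{p}_0$ is a field
simply isomorphic to $K$. Hence $\mathfrak{o}^*\mathfrak{p}_0$ is prime (and zero-dimensional).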


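11. In this and in the following sections we shall derive several properties
of the conductor in the special case when $\mathfrak{o}^*$ is the integral closure of
$\mathfrak{o}$ and $\mathfrak{o} = K[\eta_1, \dots, \eta_r, \omega]$. Here $\eta_1, \dots, \eta_r$ are algebraically
independent elements of $\Sigma$ and $\omega$ ($\in \mathfrak{o}^*$) is a primitive element of $\Sigma$
with respect to the field $K(\eta_1, \dots, \eta_r)$. Let $n$ be the relative degree
$[\Sigma : K(\eta_1, \dots, \eta_r)]$, and let $\mathfrak{c}(\omega)$ be the conductor $\mathfrak{c}(\mathfrak{o}; \mathfrak{o}^*)$.
Let $\mathfrak{p}$ be an arbitrary zero-dimensional ideal in $P = K[\eta_1, \dots, \eta_r]$. We may assume that
$\mathfrak{p} = (\eta_1, \dots, \eta_r)$.


THEOREM 9. A necessary and sufficient condition that $1, \omega, \dots, \omega^{n-1}$
form a $K$-basis of the $K$-module $\mathfrak{o}^*/\mathfrak{o}^*\mathfrak{p}$ is that the contracted ideal $\mathfrak{c}(\omega) \cap P$
should not be divisible by $\mathfrak{p}$.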


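Proof. That the condition is sufficient is trivial. In fact, assuming
$\mathfrak{c}(\omega) \cap P \not\equiv 0\ (\mathfrak{p})$, let $g$ be an element in $\mathfrak{c}(\omega) \cap P$ but not in $\mathfrak{p}$. Let $\zeta$ be an
arbitrary element in $\mathfrak{o}^*$ and let

$g\zeta = g_0 + g_1\omega + \cdots + g_{n-1}\omega^{n-1}, \quad g_i \in P.$

If we reduce $g, g_0, \dots, g_{n-1}$ modulo $\mathfrak{p}$, we get, in view of $g \not\equiv 0\ (\mathfrak{p})$:

(25) $\zeta \equiv c_0 + c_1\omega + \cdots + c_{n-1}\omega^{n-1}\ (\mathfrak{o}^*\mathfrak{p}), \quad c_i \in K,$

as was asserted.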

Somewhat more difficult is the proof that the condition is necessary. We
shall make use of the following well-known relation between $\mathfrak{c}(\omega)$, the
complementary module $\mathfrak{e}$ of $\mathfrak{o}^*$ and the different $G'_\omega$ of $\omega$:$^{15}$

(26) $\mathfrak{c}(\omega) = \mathfrak{e}\, G'_\omega.$

15 As was pointed out by Schmeidler [6], this relation holds true for fields of
algebraic functions of several variables, the proof being the same as in the case of
algebraic functions of one variable.




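We assume then that for every element $\zeta$ in $\mathfrak{o}^*$ a congruence such as (25)
holds true. We can write (25) as follows:

(25') $\zeta = c_0 + c_1\omega + \cdots + c_{n-1}\omega^{n-1} + A_1\eta_1 + \cdots + A_r\eta_r, \quad A_i \in \mathfrak{o}^*.$

If we apply the congruence (25), taking as $\zeta$ any of the elements $A_1, \dots, A_r$,
we derive from (25') a congruence of the form:

$\zeta \equiv f_1\ (\mathfrak{o}^*\mathfrak{p}^2),$

where $f_1 = f_{10} + f_{11}\omega + \cdots + f_{1,n-1}\omega^{n-1}$, and $f_{1i}$ is a polynomial of first degree
in $\eta_1, \dots, \eta_r$. Applying repeatedly this procedure we get more generally the
following congruence:

(27) $\zeta \equiv f_{\rho-1}\ (\mathfrak{o}^*\mathfrak{p}^\rho),$

where $\rho$ is an arbitrary integer and $f_{\rho-1}$ is a polynomial, of degree $\leq n - 1$
in $\omega$ and of degree $\leq \rho - 1$ in $\eta_1, \dots, \eta_r$.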

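We now consider the complementary module $\mathfrak{e}'$ of $\mathfrak{o}$, i.e. the set of all
elements $\eta$ in $\Sigma$ such that $T(\eta\xi)$ (trace of $\eta\xi$ with respect to the field
$K(\eta_1, \dots, \eta_r)$) is in $P$ if $\xi$ is in $\mathfrak{o}$. It is well known that the elements

$1/G'_\omega,\ \omega/G'_\omega,\ \dots,\ \omega^{n-1}/G'_\omega$

form a $P$-basis for $\mathfrak{e}'$. Here $G'_\omega$ is the different of $\omega$, where $G(\eta_1, \dots, \eta_r; \omega) = 0$
is the irreducible equation for $\omega$.

In view of the existence of a finite $P$-basis in $\mathfrak{o}^*$ and in $\mathfrak{e}'$, it follows that
the traces of all the products $\eta\zeta$, $\eta \in \mathfrak{e}'$, $\zeta \in \mathfrak{o}^*$, can be written as rational
functions in $\eta_1, \dots, \eta_r$ with the same denominator. Let $h(\eta_1, \dots, \eta_r)$
be this common denominator. Let us fix an element $\zeta$ in $\mathfrak{o}^*$ and an element $\eta$
in $\mathfrak{e}'$ and let $T(\zeta\eta) = g(\eta_1, \dots, \eta_r)/h(\eta_1, \dots, \eta_r)$. We apply the congruence (27). The trace
of the product $f_{\rho-1} \cdot \eta$ is a polynomial $\psi$ in $\eta_1, \dots, \eta_r$, since $\eta \in \mathfrak{e}'$. The
trace of the product $(\zeta - f_{\rho-1})\eta$ will be, by (27), of the form $g_\rho/h$,
where $g_\rho \equiv 0\ (\mathfrak{p}^\rho)$. We have then the relation: $g/h = \psi + g_\rho/h$, or $g = h\psi + g_\rho$,
or finally the congruence

(28) $g \equiv 0\ (h, \mathfrak{p}^\rho).$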


We separate in the polynomial $h$ the factors which belong to $\mathfrak{p}$ (i.e. those
which vanish for $\eta_1 = \cdots = \eta_r = 0$) from the remaining factors. We write
then $h = \sigma h_1$, where $h_1(0, \dots, 0) \neq 0$ and $\sigma$ is the product of the
factors which are $\equiv 0\ (\mathfrak{p})$. If such factors are not present in $h$, then we put
$\sigma = 1$. We proceed to show that from the fact that the congruence (28)
holds true for any positive integer $\rho$, it necessarily follows that $g$ is divisible
by $\sigma$.

It is clear that in (28) it is permissible to divide through $g$ and $h$ by any
common factor of $g$ and $h$, provided (if that common factor begins with terms
of lowest degree $\rho_0$) that we replace $\rho$ by $\rho - \rho_0$. Hence, to prove our assertion
it is sufficient to show that $g$ and $h$ cannot be relatively prime unless $\sigma = 1$.

Performing, if necessary, a preliminary linear homogeneous transformation
on the elements $\eta_1, \dots, \eta_r$, we may assume that $h$, considered as polynomial
in $\eta_r$, has leading coefficient 1. Since $g = h\psi + g_\rho$, we deduce then
that the resultant $R$ of $g$ and $h$ (considered as polynomials in $\eta_r$) coincides
with the resultant of $g_\rho$ and $h$. Let

$h = \eta_r^\nu + A_1\eta_r^{\nu-1} + \cdots + A_\nu, \quad g_\rho = B_0\eta_r^\mu + B_1\eta_r^{\mu-1} + \cdots + B_\mu.$

Since $g_\rho \equiv 0\ (\mathfrak{p}^\rho)$, it follows that $B_\mu, B_{\mu-1}, \dots, B_{\mu-\rho+1}$ begin with terms of
lowest degree not less than $\rho, \rho - 1, \dots, 1$ respectively. Now let us assume
that $\sigma \neq 1$. Then $A_\nu$ begins with terms of lowest degree $\geq 1$. We apply a
theorem on the resultant proved in [9], p. 250. If we attach to $A_\nu$ the weight 1,
to $B_\mu, B_{\mu-1}, \dots, B_{\mu-\rho+1}$ the weights $\rho, \rho - 1, \dots, 1$ respectively, and to the
remaining coefficients the weight zero, then (as a special case of the theorem
referred to) every term in the resultant $R$ is of weight $\geq \rho$. It follows that $R$,
as a polynomial in $\eta_1, \dots, \eta_{r-1}$, begins with terms of lowest degree $\geq \rho$. Since
$\rho$ is an arbitrary integer, it follows that $R$ is identically zero. Hence $h$ and $g$
have a common factor, q.e.d.$^{16}$


16 Our assertion can also be proved without making use of the properties of the
resultant. Assume that $g$ and $h$ are relatively prime. Then the intersection of the
ideals $(h, \mathfrak{p}^\rho)$, $\rho = 1, 2, \dots$, is at most $(r - 2)$-dimensional, since both $g$ and $h$ belong
to it. If then $\bar\eta_1, \dots, \bar\eta_{r-2}$ are linear forms in $\eta_1, \dots, \eta_r$ with non-special coefficients,
then the intersection of the ideals $(h, \bar\eta_1, \dots, \bar\eta_{r-2}, \mathfrak{p}^\rho)$ is at most zero-dimensional.
We may assume that $\bar\eta_1, \dots, \bar\eta_{r-2}$ coincide with $\eta_1, \dots, \eta_{r-2}$ respectively, and we denote
by $\mathfrak{B}$ the intersection of the ideals $(h, \eta_1, \dots, \eta_{r-2}, \mathfrak{p}^\rho)$, $\rho = 1, 2, \dots$. $\mathfrak{B}$ is at most
0-dimensional. If $\sigma \neq 1$, then $h \equiv 0\ (\mathfrak{p})$ and hence $\mathfrak{B} \equiv 0\ (\mathfrak{p})$. Consequently, $\mathfrak{B}$ is
0-dimensional and one of its prime ideals is $\mathfrak{p}$. If $q$ is the exponent of the corresponding
primary component of $\mathfrak{B}$, then (see [7], p. 49) $\mathfrak{p}^q \equiv 0\ (\mathfrak{B}, \mathfrak{p}^{q+1}) \equiv 0\ (h, \eta_1, \dots, \eta_{r-2}, \mathfrak{p}^\rho, \mathfrak{p}^{q+1})$.
Since $\rho$ is arbitrary, this implies $\mathfrak{p}^q \equiv 0\ (h, \eta_1, \dots, \eta_{r-2}, \mathfrak{p}^{q+1})$. It is
immediately seen that this congruence leads to the following absurd conclusion: if $h = h_i +
h_{i+1} + \cdots$, where $h_j$ is a form of degree $j$, and if $h'_j = h'_j(\eta_{r-1}, \eta_r)$ denotes the sum of
terms in $h_j$ which depend only on $\eta_{r-1}$ and $\eta_r$, then every form in $\eta_{r-1}, \eta_r$, of degree $q$,
is divisible by $h'_i$ (or is identically zero, if $h'_i = 0$).

One can also observe that the congruence (28) implies that the hypersurface $g = 0$
has at the origin a contact of infinite order with every algebraic curve which passes



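Since $h$ is the least common denominator of all the traces $T(\zeta\eta)$, $\zeta \in \mathfrak{o}^*$,
$\eta \in \mathfrak{e}'$, we conclude by the above result that necessarily $h \not\equiv 0\ (\mathfrak{p})$. Now if
$T(\zeta\eta) = g/h$, then $T(\zeta \cdot \eta h) = g$. From this it follows $h\eta \in \mathfrak{e}$, $\eta$ arbitrary
in $\mathfrak{e}'$. In particular $h \cdot (1/G'_\omega) \in \mathfrak{e}$, i.e. $h \in \mathfrak{e}\, G'_\omega$. Hence, by (26),
$h \equiv 0\ (\mathfrak{c}(\omega))$, and since $h \not\equiv 0\ (\mathfrak{p})$, our theorem is proved.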

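12. We are now in position to prove the converse of Theorem 8, always
under the assumption: $\mathfrak{o} = K[\eta_1, \dots, \eta_r, \omega]$, $\mathfrak{o}^*$ the integral closure of $\mathfrak{o}$.


THEOREM 10. If $\mathfrak{p}_0$ is a prime zero-dimensional ideal in $\mathfrak{o}$ and if $\mathfrak{o}^*\mathfrak{p}_0$ is
prime in $\mathfrak{o}^*$, then $\mathfrak{c}(\mathfrak{o}, \mathfrak{o}^*) \not\equiv 0\ (\mathfrak{p}_0)$.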


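Proof. We may assume that $\eta_i \equiv 0(\mathfrak{p}_0)$, $i = 1, \dots, r$, and also that
$\omega \equiv 0(\mathfrak{p}_0)$. Then $\mathfrak{p}_0 = (\eta_1, \eta_2, \dots, \eta_r, \omega)$. Let $V_r$ be the variety given in
$S_{r+1}$ by the defining equation $G(\eta_1, \dots, \eta_r, \omega) = 0$ and let $A$ be the point
$(0, 0, \dots, 0)$ of $V_r$ which corresponds to $\mathfrak{p}_0$. Let $M$ be the subvariety of $V_r$
defined by the conductor $\mathfrak{c}(\mathfrak{o}, \mathfrak{o}^*)$. A general line on $A$ will intersect $V_r$,
outside of $A$, in points which are not on $M$. Subjecting the elements
$\eta_1, \dots, \eta_r, \omega$ to a preliminary linear homogeneous transformation, we may
assume that the line $\eta_1 = \cdots = \eta_r = 0$ is general in the sense just specified.
Let $[\mathfrak{q}_0, \mathfrak{q}_1, \dots, \mathfrak{q}_s]$ be the decomposition of the ideal $\mathfrak{o} \cdot (\eta_1, \dots, \eta_r)$ into
0-dimensional primary components, and let $\mathfrak{p}_0, \mathfrak{p}_1, \dots, \mathfrak{p}_s$ be the corresponding prime
ideals. Our assumption concerning the line $\eta_1 = \cdots = \eta_r = 0$ implies that
$\mathfrak{c}(\mathfrak{o}, \mathfrak{o}^*) \not\equiv 0(\mathfrak{p}_i)$, $i = 1, \dots, s$. Let $\omega \equiv c_i(\mathfrak{p}_i)$. The constants $c_0, c_1, \dots, c_s$
are distinct, since $\mathfrak{p}_i = (\eta_1, \dots, \eta_r, \omega - c_i)$. By Theorem 8 it
follows that the ideals
\[ \mathfrak{p}^*_i = \mathfrak{o}^*\mathfrak{p}_i = \mathfrak{o}^*(\eta_1, \dots, \eta_r, \omega - c_i) \qquad (i = 1, 2, \dots, s) \]
are prime. By our assumption, the ideal
\[ \mathfrak{p}^*_0 = \mathfrak{o}^*\mathfrak{p}_0 = \mathfrak{o}^*(\eta_1, \dots, \eta_r, \omega) \]
is also prime. From this it follows:
\[ [\mathfrak{p}^*_0, \mathfrak{p}^*_1, \dots, \mathfrak{p}^*_s] = \mathfrak{p}^*_0 \mathfrak{p}^*_1 \cdots \mathfrak{p}^*_s = \mathfrak{o}^*\Bigl(\eta_1, \dots, \eta_r, \prod_{i=0}^{s} (\omega - c_i)\Bigr). \tag{29} \]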

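Let now $\zeta$ be an arbitrary element in $\mathfrak{o}^*$ and let $\zeta \equiv d_i(\mathfrak{p}^*_i)$, $i = 0, 1, \dots, s$.
We can find a polynomial $f(\omega)$, with constant coefficients (of degree $\leq s$
in $\omega$), such that $f(c_i) = d_i$. For such a polynomial $f(\omega)$ we will have
$\zeta - f(\omega) \equiv 0([\mathfrak{p}^*_0, \dots, \mathfrak{p}^*_s])$, or, in view of (29),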


through the origin and lies on the hypersurface h =0. Hence every such irreducible 
algebraic curve must also lie on the hypersurface g =0, which shows that the two 
hypersurfaces g = 0 and h =0 must have a common component. 




Applying to the elements $\zeta^*_i$ and $\zeta^*$ the considerations just applied to $\zeta$, we
find that $\zeta$ satisfies a congruence of the form


where $f_1$ is a polynomial (of degree $\leq 1$ in $\eta_1, \dots, \eta_r$). More generally,
we find


where $f_\rho$ is a polynomial in $\omega, \eta_1, \dots, \eta_r$, whose degree in $\omega$ may be assumed
to be $\leq n - 1$. Now if $\rho + 1$ is sufficiently high, then $(\mathfrak{p}^*_i)^{\rho+1} \equiv 0(\mathfrak{o}^*\mathfrak{q}_i)$.¹⁷
Hence, for $\rho$ sufficiently high, we will have


or, letting $f_\rho(\omega; 0, \dots, 0) = c_0 + c_1\omega + \cdots + c_{n-1}\omega^{n-1}$, $c_i \subset K$,


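Such a congruence holds true for any element $\zeta$ in $\mathfrak{o}^*$. Hence, by Theorem 9,
we must have $\mathfrak{c}(\mathfrak{o}, \mathfrak{o}^*) \not\equiv 0(\mathfrak{p})$. This implies $\mathfrak{c}(\mathfrak{o}, \mathfrak{o}^*) \not\equiv 0(\mathfrak{p}_i)$, $i = 0, 1, \dots, s$,
in particular $\mathfrak{c}(\mathfrak{o}, \mathfrak{o}^*) \not\equiv 0(\mathfrak{p}_0)$, which proves our assertion.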
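
The interpolation step used in the proof above (a polynomial $f(\omega)$ of degree at most $s$ taking prescribed constant values $d_i$ at distinct constants $c_i$) is ordinary Lagrange interpolation. The following is only a minimal sketch of that step, assuming for illustration that $K$ is the field of rational numbers; the names `lagrange_interpolate` and `evaluate` are hypothetical and do not come from the paper.

```python
from fractions import Fraction


def lagrange_interpolate(points):
    """Return coefficients (constant term first) of the unique polynomial
    of degree <= s passing through the s + 1 given (c_i, d_i) pairs.

    The c_i must be distinct, exactly as the constants c_0, ..., c_s are
    in the proof; otherwise a ZeroDivisionError is raised.
    """
    n = len(points)
    coeffs = [Fraction(0)] * n
    for i, (ci, di) in enumerate(points):
        basis = [Fraction(1)]      # running product of the factors (x - c_j)
        denom = Fraction(1)        # running product of the factors (c_i - c_j)
        for j, (cj, _) in enumerate(points):
            if j == i:
                continue
            # Multiply the current basis polynomial by (x - c_j).
            new = [Fraction(0)] * (len(basis) + 1)
            for k, a in enumerate(basis):
                new[k] -= a * cj
                new[k + 1] += a
            basis = new
            denom *= (ci - cj)
        for k, a in enumerate(basis):
            coeffs[k] += Fraction(di) * a / denom
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x (Horner's rule)."""
    result = Fraction(0)
    for a in reversed(coeffs):
        result = result * x + a
    return result
```

For instance, with $c = (0, 1, 2)$ and $d = (1, 3, 11)$ the sketch returns the coefficients of $1 - \omega + 3\omega^2$, a polynomial of degree $s = 2$.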


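13. In the preceding section we have characterized in $\mathfrak{o}$ the prime zero-dimensional
divisors $\mathfrak{p}_0$ of the conductor $\mathfrak{c}(\mathfrak{o}, \mathfrak{o}^*)$, by means of the decomposition
of the extended ideal $\mathfrak{o}^*\mathfrak{p}_0$. Namely, $\mathfrak{p}_0$ is a divisor of $\mathfrak{c}(\mathfrak{o}, \mathfrak{o}^*)$ if and
only if the zero-dimensional ideal $\mathfrak{o}^*\mathfrak{p}_0$ is not prime in $\mathfrak{o}^*$. This is a relative
characterization. It is of interest to give an intrinsic characterization in $\mathfrak{o}^*$
of those zero-dimensional ideals $\mathfrak{p}^*_0$ which are extended ideals $\mathfrak{o}^*\mathfrak{p}_0$ of ideals
$\mathfrak{p}_0$ in some subrings of $\mathfrak{o}^*$, such as $\mathfrak{o}$. The question is then the following. We
consider subrings $\mathfrak{o}$ of $\mathfrak{o}^*$ which satisfy the following conditions: (1) $\mathfrak{o}^*$ is the
integral closure of $\mathfrak{o}$; (2) $\mathfrak{o}$ is generated by $r + 1$ elements, $\mathfrak{o} = K[\eta_1, \dots, \eta_{r+1}]$;
(3) $\mathfrak{o}$ and $\mathfrak{o}^*$ have the same quotient field. Given a prime zero-dimensional
ideal $\mathfrak{p}^*_0$ in $\mathfrak{o}^*$, we ask: under what condition does there exist a subring $\mathfrak{o}$
such that $\mathfrak{p}^*_0 = \mathfrak{o}^*\mathfrak{p}_0$, where $\mathfrak{p}_0 = \mathfrak{p}^*_0 \cap \mathfrak{o}$? We proceed to prove that such a
subring $\mathfrak{o}$ exists if and only if the rank of the K-module $\mathfrak{p}^*_0/\mathfrak{p}^{*2}_0$ is not greater
than $r + 1$.¹⁸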


Proof. Consider, quite generally, any zero-dimensional ideal $\mathfrak{p}^*_0$ in $\mathfrak{o}^*$,
and let $\rho$ be the rank of the ring $\mathfrak{p}^*_0/\mathfrak{p}^{*2}_0$, considered as a K-module. Let


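17 Since $\mathfrak{o}^*\mathfrak{p}_i = \mathfrak{p}^*_i$, the ideal $\mathfrak{o}^*\mathfrak{q}_i$ is primary and belongs to the ideal $\mathfrak{p}^*_i$. Namely,
in the first place $\mathfrak{o}^*\mathfrak{q}_i$ is zero-dimensional. Let $\mathfrak{p}'^*$ be one of its prime ideals, and let
$\mathfrak{p}'^* \cap \mathfrak{o} = \mathfrak{p}'$. Also $\mathfrak{p}'$ is prime and zero-dimensional. Since $\mathfrak{p}'^* \supset \mathfrak{q}_i$, we have $\mathfrak{p}' \supset \mathfrak{q}_i$,
whence $\mathfrak{p}' = \mathfrak{p}_i$. Consequently $\mathfrak{p}'^* \supset \mathfrak{o}^*\mathfrak{p}' = \mathfrak{o}^*\mathfrak{p}_i = \mathfrak{p}^*_i$, i.e. $\mathfrak{p}'^* = \mathfrak{p}^*_i$.

18 Hence, in particular, if the point which corresponds to $\mathfrak{p}^*_0$ on the variety determined
by the ring $\mathfrak{o}^*$ is simple, because then the rank is $r$ (see Theorem 3.2).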




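$\omega_1, \dots, \omega_\rho$ be a K-basis of $\mathfrak{p}^*_0/\mathfrak{p}^{*2}_0$. We have evidently: $(\omega_1, \dots, \omega_\rho, \mathfrak{p}^{*2}_0)
= \mathfrak{p}^*_0$. From this we deduce (see section 2) that $\mathfrak{p}^*_0$ is an isolated component
of the ideal $\mathfrak{o}^*(\omega_1, \dots, \omega_\rho)$. Since $\mathfrak{p}^*_0$ is zero-dimensional, we must have
$\rho \geq r$ (see footnote 3).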

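Now let us assume that the rank $\rho$ is $\leq r + 1$. Let $\mathfrak{o}^* = K[\xi_1, \dots, \xi_n]$
and let us assume, as usual, that $\mathfrak{p}^*_0 = (\xi_1, \dots, \xi_n)$. We can then take as a
K-basis for $\mathfrak{p}^*_0/\mathfrak{p}^{*2}_0$ $\rho$ linear forms in the $\xi$'s. We may assume that $\xi_1, \dots, \xi_\rho$
form such a K-basis. Let
\[ \xi_i \equiv \sum_{j=1}^{\rho} c_{ij}\xi_j \ (\mathfrak{p}^{*2}_0), \qquad i = 1, 2, \dots, n. \]
The rank of the matrix $(c_{ij})$ is $\rho$. Given $\rho$ forms $\eta_j = \sum_{i=1}^{n} u_{ji}\xi_i$, they will form a
K-basis for $\mathfrak{p}^*_0/\mathfrak{p}^{*2}_0$ if and only if the determinant of the $\rho$ by $\rho$ matrix
$(u_{ji}) \cdot (c_{i\nu})$ is $\neq 0$. We consider the two possible cases: $\rho = r$, $\rho = r + 1$.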
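
The determinant criterion just stated can be checked mechanically. The sketch below is an illustration only, assuming $K$ is the rational field and that the matrices $(u_{ji})$ and $(c_{i\nu})$ are given as nested lists; the helper names `mat_mul`, `det` and `forms_give_basis` are hypothetical.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of a (rho x n) matrix a and an (n x rho) matrix b."""
    return [[sum(Fraction(a[i][k]) * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def det(matrix):
    """Determinant of a square matrix by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result           # a row swap changes the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n):
                m[r][k] -= factor * m[col][k]
    return result


def forms_give_basis(u, c):
    """True when the forms eta_j = sum_i u[j][i] * xi_i are a K-basis modulo
    the square of the ideal, i.e. when det((u_ji) * (c_iv)) is non-zero."""
    return det(mat_mul(u, c)) != 0
```

As a hypothetical case take $n = 3$, $\rho = 2$ and $\xi_3 \equiv \xi_1 + \xi_2$, so that $(c_{i\nu})$ has rows $(1,0)$, $(0,1)$, $(1,1)$. The forms $\xi_1, \xi_2$ pass the test, while $\xi_1 - \xi_3, \xi_2$ fail it, since $\xi_1 - \xi_3 \equiv -\xi_2$.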

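Let $\rho = r$. We choose the coefficients $u_{ji}$ so that the above determinant
be $\neq 0$ and that, in addition, every element of $\mathfrak{o}^*$ be integrally dependent on
$\eta_1, \dots, \eta_r$. Then $\mathfrak{p}^*_0$ is an isolated component of the zero-dimensional ideal
$\mathfrak{o}^*(\eta_1, \dots, \eta_r)$. Let $\mathfrak{p}^*_1, \dots, \mathfrak{p}^*_s$ be the other prime ideals of $\mathfrak{o}^*(\eta_1, \dots, \eta_r)$.
We choose in $\mathfrak{o}^*$ a primitive element $\omega$ (with respect to the field $K(\eta_1, \dots, \eta_r)$)
such that $\omega \equiv 0(\mathfrak{p}^*_0)$, $\omega \not\equiv 0(\mathfrak{p}^*_i)$, $i = 1, 2, \dots, s$. Then $\mathfrak{p}^*_0 = \mathfrak{o}^*(\eta_1, \dots, \eta_r, \omega)$
and the ring $\mathfrak{o} = K[\eta_1, \dots, \eta_r, \omega]$ satisfies all our conditions.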

Let $\rho = r + 1$. We first choose the coefficients of the first $r$ rows of the
matrix $(u_{ji}) \cdot (c_{i\nu})$ in such a manner that their matrix be of rank $r$ and that
every element in $\mathfrak{o}^*$ be integrally dependent on $\eta_1, \dots, \eta_r$. With this choice
of the first $r$ rows we are certain that when $u_{r+1,1}, \dots, u_{r+1,n}$ have non-special
values, then the determinant of the $(r + 1)$-row matrix $(u_{ji}) \cdot (c_{i\nu})$ is $\neq 0$.
Let $[\mathfrak{q}^*_0, \mathfrak{q}^*_1, \dots, \mathfrak{q}^*_s]$ be the decomposition of the zero-dimensional ideal
$\mathfrak{o}^*(\eta_1, \dots, \eta_r)$ into primary components. Let $\mathfrak{p}^*_0, \dots, \mathfrak{p}^*_s$ be the corresponding
prime ideals. Again, for non-special values of the coefficients $u_{r+1,i}$,
the element $\eta_{r+1} = \sum_{i=1}^{n} u_{r+1,i}\xi_i$ will satisfy the conditions:
$\eta_{r+1} \not\equiv 0(\mathfrak{p}^*_i)$, $i = 1, 2, \dots, s$ (since the $n$ elements $\xi_i$ are not all $\equiv 0(\mathfrak{p}^*_i)$, if $i \neq 0$).
Finally, for non-special coefficients $u_{r+1,i}$ the element $\eta_{r+1}$ will be a primitive
element of $\mathfrak{o}^*$ with respect to the field $K(\eta_1, \dots, \eta_r)$. If we choose the
coefficients $u_{r+1,i}$ so as to satisfy all these conditions, it will follow that
$\mathfrak{o}^*(\eta_1, \dots, \eta_{r+1}) = \mathfrak{p}^*_0$ and that the ring $\mathfrak{o} = K[\eta_1, \dots, \eta_{r+1}]$ satisfies all
our requirements, q. e. d.¹⁹


19 The example at the end of section 9 illustrates the possibility $\rho > r + 1$. In that
example we have $r = 2$, $\mathfrak{p}^*_0 = (\eta_1, \eta_2, \eta_3, \eta_4)$, and it is easily seen that the four
elements $\eta_i$ are linearly independent modulo $\mathfrak{p}^{*2}_0$. Hence $\rho = 4$.




III. Normal varieties in the affine space. 


14. In part I we have given several characterizations of simple and
multiple points of a $V_r$ in $S_n$, mostly of ideal theoretic nature. We have seen
that these characterizations are not so much properties of the set of coördinates
$\xi_1, \dots, \xi_n$ of the general point of $V_r$, as of the whole ring $\mathfrak{o} = K[\xi_1, \dots, \xi_n]$,
namely of the zero-dimensional ideals in $\mathfrak{o}$. If $\eta_1, \dots, \eta_m$ is another set of
elements in $\mathfrak{o}$ such that $\mathfrak{o} = K[\eta_1, \dots, \eta_m]$, then the $\eta$'s are the coördinates
of the general point of a variety $W_r$ in an $S_m$, birationally equivalent to $V_r$.
The points of $V_r$ and $W_r$ are in (1,1) correspondence without exceptions
(everything being confined to the points in the affine space, i.e. at finite
distance), since their points are in (1,1) correspondence with the prime
zero-dimensional ideals in $\mathfrak{o}$. To simple points of $V_r$ correspond simple points of
$W_r$, and conversely. Topologically speaking, $V_r$ and $W_r$ (points at infinity
excluded) are homeomorphic loci ("open" or "relative" circuits). We shall
say that $V_r$ and $W_r$ are integrally equivalent, alluding to the fact that they
are both defined in terms of one and the same integral domain $\mathfrak{o}$ in the field $\Sigma$.
Thus $\mathfrak{o}$ determines a class of birationally equivalent varieties any two of which
are integrally equivalent.

We shall say that an algebraic variety $V_r$ in an affine $S_n$ is normal in the
affine space, if the coördinates $\xi_1, \dots, \xi_n$ of the general point of $V_r$ give rise
to an integral domain $\mathfrak{o} = K[\xi_1, \dots, \xi_n]$ which is integrally closed in its
quotient field. In the sequel we will speak of the normal variety determined
by an integrally closed finite integral domain $\mathfrak{o}$ in the field $\Sigma$, meaning by this
any one of the integrally equivalent varieties determined by $\mathfrak{o}$.


15. We prove now the following 


LEMMA. Let $V_r$ be an irreducible r-dimensional algebraic variety in $S_n$,
$\mathfrak{o} = K[\xi_1, \dots, \xi_n]$ the corresponding integral domain. If $V_\rho$ is a $\rho$-dimensional
irreducible algebraic subvariety of $V_r$ and $\mathfrak{p}_\rho$ is the corresponding
$\rho$-dimensional prime ideal in $\mathfrak{o}$, then a sufficient condition that $V_\rho$ be a
non-singular manifold of $V_r$ (i.e. that not all points of $V_\rho$ be multiple points of
$V_r$) is that there should exist $r - \rho$ elements $\eta_1, \dots, \eta_{r-\rho}$ in $\mathfrak{o}$ such that $\mathfrak{p}_\rho$
is an isolated component of the ideal $\mathfrak{o} \cdot (\eta_1, \dots, \eta_{r-\rho})$.


This lemma generalizes the property which we have used in section 2 for
the definition of a simple point ($\rho = 0$).

For the proof we consider the ring of residual classes $\bar{\mathfrak{o}} = \mathfrak{o}/\mathfrak{p}_\rho$. If
$\bar\xi_1, \dots, \bar\xi_n$ denote the elements of $\bar{\mathfrak{o}}$ which correspond to $\xi_1, \dots, \xi_n$ respectively
in the homomorphism $\mathfrak{o} \sim \bar{\mathfrak{o}}$, then it is clear that $\bar\xi_1, \dots, \bar\xi_n$ are the
coördinates of the general point of $V_\rho$. We take a point $P$ of $V_\rho$ which does




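not lie on either one of the following two subvarieties of $V_\rho$ (both of dimension
$\leq \rho - 1$): (1) the variety of singular points of $V_\rho$ (the points of $V_\rho$
which are multiple for $V_\rho$); (2) the variety in which $V_\rho$ intersects the possible
other components of the variety $V'$ defined by the ideal $\mathfrak{o} \cdot (\eta_1, \dots, \eta_{r-\rho})$
(since $\mathfrak{p}_\rho$ is an isolated component of $\mathfrak{o} \cdot (\eta_1, \dots, \eta_{r-\rho})$, $V_\rho$ itself is one of
the irreducible components of $V'$). Let $\bar{\mathfrak{p}}_0$ and $\mathfrak{p}_0$ be the prime 0-dimensional
ideals in $\bar{\mathfrak{o}}$ and $\mathfrak{o}$ respectively which correspond to the point $P$, regarded either
as a point of $V_\rho$ or as a point of $V_r$. Clearly $\mathfrak{p}_\rho \equiv 0(\mathfrak{p}_0)$ and $\mathfrak{p}_0 \to \bar{\mathfrak{p}}_0$ in the
homomorphism between $\mathfrak{o}$ and $\bar{\mathfrak{o}}$. Since $P$ is a simple point for $V_\rho$, it follows
that the rank of the K-module $\bar{\mathfrak{p}}_0/\bar{\mathfrak{p}}_0^2$ is $\rho$ (Theorem 3.2). Let $\bar\eta_{r-\rho+1}, \dots, \bar\eta_r$
be a K-module basis of $\bar{\mathfrak{p}}_0/\bar{\mathfrak{p}}_0^2$, and let $\eta_{r-\rho+1}, \dots, \eta_r$ be any set of $\rho$ elements
in $\mathfrak{o}$ belonging to the residual classes $\bar\eta_{r-\rho+1}, \dots, \bar\eta_r$ respectively. Any element $\bar\omega$
in $\bar{\mathfrak{o}}$ satisfies a congruence of the form
\[ \bar\omega \equiv d_0 + \sum_{i=1}^{\rho} d_i\bar\eta_{r-\rho+i} \ (\bar{\mathfrak{p}}_0^2), \qquad d_i \subset K. \]
Since the largest ideal in $\mathfrak{o}$ which is mapped upon $\bar{\mathfrak{p}}_0^2$ in the homomorphism
$\mathfrak{o} \sim \bar{\mathfrak{o}}$ is the ideal $(\mathfrak{p}_0^2, \mathfrak{p}_\rho)$, it follows that any element $\omega$ of $\mathfrak{o}$ satisfies a
congruence of the form
\[ \omega \equiv d_0 + \sum_{i=1}^{\rho} d_i\eta_{r-\rho+i} \ (\mathfrak{p}_0^2, \mathfrak{p}_\rho). \tag{30} \]
Now, in view of our hypothesis concerning the ideal $\mathfrak{o} \cdot (\eta_1, \dots, \eta_{r-\rho})$ and
the point $P$, it follows that $\mathfrak{p}_0$ does not divide any of the primary components
of $\mathfrak{o} \cdot (\eta_1, \dots, \eta_{r-\rho})$ distinct from $\mathfrak{p}_\rho$. Hence there exists an element $\alpha$ in $\mathfrak{o}$
such that $\alpha \not\equiv 0(\mathfrak{p}_0)$ and $\alpha\mathfrak{p}_\rho \equiv 0(\eta_1, \dots, \eta_{r-\rho})$. Hence for any element $\zeta$
in $\mathfrak{p}_\rho$ we have: $\alpha\zeta = A_1\eta_1 + \cdots + A_{r-\rho}\eta_{r-\rho}$, where $A_i \subset \mathfrak{o}$. If then
$\alpha \equiv c_0(\mathfrak{p}_0)$, $c_0 \neq 0$, and $A_j \equiv c_j(\mathfrak{p}_0)$, $c_j \subset K$, it follows immediately,
since the elements $\zeta, \eta_1, \dots, \eta_{r-\rho}$ are in $\mathfrak{p}_0$, that a congruence of the form
\[ c_0\zeta \equiv c_1\eta_1 + \cdots + c_{r-\rho}\eta_{r-\rho} \ (\mathfrak{p}_0^2) \]
holds true for any element $\zeta$ in $\mathfrak{p}_\rho$. From this, in view of (30), we conclude
that the $r$ elements $\eta_1, \dots, \eta_r$ form a K-module basis for $\mathfrak{p}_0/\mathfrak{p}_0^2$, which is then of
rank $r$. Hence $P$ is a simple point of $V_r$. Thus it is proved that $V_\rho$ contains
points which are simple for $V_r$, q. e. d.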
The above lemma implies as an immediate consequence the following important
property of normal varieties:

THEOREM 11. The singular manifold of a normal variety $V_r$ in an affine
space is of dimension $\leq r - 2$.


Proof. It is sufficient to prove that every $(r - 1)$-dimensional subvariety
$V_{r-1}$ of $V_r$ is non-singular. Let $\mathfrak{p}_{r-1}$ be the prime ideal of $V_{r-1}$ in $\mathfrak{o}$. Since




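$\mathfrak{o}$ is integrally closed, the primary ideals belonging to the minimal ideal $\mathfrak{p}_{r-1}$
are the symbolic powers $\mathfrak{p}^{(i)}_{r-1}$ of $\mathfrak{p}_{r-1}$, $i = 1, 2, \dots$.²⁰ Let $\omega$ be an element in
$\mathfrak{p}_{r-1}$ but not in $\mathfrak{p}^{(2)}_{r-1}$. Then, by the principal ideal theorem in the integrally
closed ring $\mathfrak{o}$, it follows that $\mathfrak{p}_{r-1}$ is an isolated component of $\mathfrak{o} \cdot \omega$. Our theorem
now follows from the above lemma in the case $\rho = r - 1$.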
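
A standard illustration of Theorem 11, which is not taken from the paper, is the quadric cone. The block below is a sketch under the assumption that $K$ has characteristic zero; the auxiliary parameters $s, t$ are introduced only for this example.

```latex
% Illustrative example (not from the paper): the quadric cone, with r = 2.
\[
  \mathfrak{o} = K[\xi_1, \xi_2, \xi_3], \qquad \xi_3^2 = \xi_1 \xi_2 .
\]
% Putting \xi_1 = s^2, \xi_2 = t^2, \xi_3 = st identifies \mathfrak{o} with K[s^2, t^2, st],
% the ring of polynomials in s, t invariant under (s, t) -> (-s, -t); it is integrally closed.
% With F = x_3^2 - x_1 x_2 the partial derivatives are
\[
  \frac{\partial F}{\partial x_1} = -x_2, \qquad
  \frac{\partial F}{\partial x_2} = -x_1, \qquad
  \frac{\partial F}{\partial x_3} = 2 x_3 ,
\]
% and they vanish simultaneously only at the origin. The singular manifold is
% therefore a single point, of dimension 0 = r - 2, as Theorem 11 allows.
```

The example also shows that the bound $r - 2$ cannot be lowered in general: a normal surface may still have isolated singular points.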


16. If $V_r$ is not normal, whence the corresponding ring $\mathfrak{o} = K[\xi_1, \dots, \xi_n]$
is not integrally closed, the passage to the integral closure $\mathfrak{o}^*$ of $\mathfrak{o}$ defines a
birational transformation of $V_r$ into the normal variety $V^*_r$ defined by $\mathfrak{o}^*$
(or into any other variety $V'_r$ integrally equivalent to $V^*_r$). We shall say
that $V^*_r$ is the derived normal variety of $V_r$. For lack of a better word we
shall use the term "integral closure" to denote the birational transformation
which carries $V_r$ into $V^*_r$. By Theorem 11, we may say that the effect of the
integral closure is the elimination of all singular manifolds of dimension $r - 1$.
Thus for algebraic curves, the integral closure transformation resolves all the
singularities of the curve (singularities at finite distance). For an algebraic
surface the integral closure resolves the multiple curves (at finite distance).
A normal surface in the affine space has only a finite number of singularities
(at finite distance).
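
The effect of the integral closure on a curve can be seen in the simplest case, the cuspidal cubic. The block below is an illustrative sketch that is not part of the paper; the parameter $t$ is introduced only for the example.

```latex
% Illustrative example (not from the paper): the cuspidal cubic, r = 1.
\[
  \mathfrak{o} = K[\xi_1, \xi_2], \qquad \xi_2^2 = \xi_1^3 .
\]
% The quotient t = \xi_2 / \xi_1 lies in the quotient field and satisfies
% t^2 = \xi_1, so it depends integrally on \mathfrak{o}. Hence
\[
  \mathfrak{o}^* = K[t], \qquad \xi_1 = t^2, \quad \xi_2 = t^3 ,
\]
% and the derived normal variety is a line, which has no singular points.
% Since \mathfrak{o} = K + t^2 K[t], the conductor is
\[
  \mathfrak{c}(\mathfrak{o}, \mathfrak{o}^*) = t^2 K[t] ,
\]
% whose only zero in \mathfrak{o} is the cusp \xi_1 = \xi_2 = 0.
```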

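One more remark concerning the birational correspondence between $V_r$
and $V^*_r$. If $P$ is a point of $V_r$, and $\mathfrak{p}_0$ the corresponding 0-dimensional prime
ideal in $\mathfrak{o}$, then $\mathfrak{o}^*\mathfrak{p}_0$ is also 0-dimensional and unmixed. If $\mathfrak{o}^*\mathfrak{p}_0 = [\mathfrak{q}^*_1, \dots, \mathfrak{q}^*_s]$,
where each $\mathfrak{q}^*_i$ belongs to a prime 0-dimensional ideal $\mathfrak{p}^*_i$ in $\mathfrak{o}^*$, and if $P^*_i$
is the point on $V^*_r$ determined by $\mathfrak{p}^*_i$, then to the point $P$ of $V_r$ there
correspond on $V^*_r$ the points $P^*_1, \dots, P^*_s$. On the other hand, every prime
0-dimensional ideal $\mathfrak{p}^*_0$ in $\mathfrak{o}^*$ determines uniquely its contracted ideal
$\mathfrak{p}_0 = \mathfrak{o} \cap \mathfrak{p}^*_0$ in $\mathfrak{o}$. Hence to a point on $V^*_r$ there corresponds a unique point
on $V_r$. By Theorem 8, if $\mathfrak{p}_0$ does not divide the conductor of $\mathfrak{o}$ relative to $\mathfrak{o}^*$,
then $s = 1$ and $\mathfrak{o}^*\mathfrak{p}_0$ is prime in $\mathfrak{o}^*$. Now assume that $P$ is a simple point
of $V_r$, and let $\xi_1, \dots, \xi_r$ be uniformizing parameters for the whole neighborhood
of $P$, where we assume that $\xi_i \equiv 0(\mathfrak{p}_0)$ and that every element of $\mathfrak{o}$
(whence also of $\mathfrak{o}^*$) is integrally dependent on $\xi_1, \dots, \xi_r$. By Theorem 4,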
there exists an element in o such that FO(~o), where 
is the norm of z—vo over Since po = 0, 
it follows that 4O0(p*:), and hence, by Theorem 5, we 
conclude that $P^*_i$ is a simple point of $V^*_r$ and that $\mathfrak{p}^*_i$ is an isolated component
of the ideal $\mathfrak{o}^*(\xi_1, \dots, \xi_r)$. Since $\mathfrak{o}^*\mathfrak{p}_0$ divides this last ideal, it follows


20 See van der Waerden [7], p. 105.
21 See Muhly and Zariski [3].


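a fortiori that each $\mathfrak{p}^*_i$ is an isolated component of $\mathfrak{o}^*\mathfrak{p}_0$, whence $\mathfrak{q}^*_i = \mathfrak{p}^*_i$
and $\mathfrak{o}^*\mathfrak{p}_0 = [\mathfrak{p}^*_1, \dots, \mathfrak{p}^*_s]$. Now at each point $P^*_i$ we have a definite
isomorphic mapping $\tau_i$ of $\mathfrak{o}^*$ upon a subring $H_i$ of the ring of formal power
series of $\xi_1, \dots, \xi_r$. These $s$ mappings must coincide on $\mathfrak{o}$. In fact, since
$\xi_1, \dots, \xi_r$ are uniformizing parameters for the whole neighborhood of $P$ on
$V_r$, we have a mapping $\tau$ of $\mathfrak{o}$ upon a subring $H$ of $K\{\xi_1, \dots, \xi_r\}$. Let $\omega$ be
any element of $\mathfrak{o}$ and let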


Then = + ++ m arbitrary, and consequently also 
then also wo = go” + +. (p*o™"). In view of the unique- 
ness of the polynomial +- dn‘, it follows = dm, and this 
proves our assertion $\tau = \tau_i$ on $\mathfrak{o}$, $i = 1, 2, \dots, s$. Since $\mathfrak{o}$ and $\mathfrak{o}^*$ have the
same quotient field $\Sigma$, it follows that these mappings also coincide on $\mathfrak{o}^*$,
whence necessarily $s = 1$.
Summarizing, we have the following


THEOREM 12. If $V^*_r$ is the derived normal variety of $V_r$ (in the affine
space), then to each point of $V^*_r$ there corresponds a unique point of $V_r$,
while to every point $P$ of $V_r$ there correspond at most a finite number of
points of $V^*_r$. This number can be greater than one only if $P$ is a singular
point of $V_r$ and lies on the subvariety defined on $V_r$ by the conductor of $\mathfrak{o}$
relative to $\mathfrak{o}^*$.²²


This theorem shows that the birational transformation between $V_r$ and
$V^*_r$ is free from fundamental points on either variety.
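
The case in which a point $P$ has more than one corresponding point on the derived normal variety is illustrated by the nodal cubic. The following worked example is a sketch that does not appear in the paper; the parameter $t$ is introduced only for the illustration.

```latex
% Illustrative example (not from the paper): the nodal cubic, r = 1.
\[
  \mathfrak{o} = K[\xi_1, \xi_2], \qquad \xi_2^2 = \xi_1^2(\xi_1 + 1) .
\]
% The quotient t = \xi_2 / \xi_1 satisfies t^2 = \xi_1 + 1, so it is integral over \mathfrak{o}:
\[
  \mathfrak{o}^* = K[t], \qquad \xi_1 = t^2 - 1, \quad \xi_2 = t(t^2 - 1) .
\]
% For the node P, with \mathfrak{p}_0 = (\xi_1, \xi_2), the extended ideal is not prime:
\[
  \mathfrak{o}^*\mathfrak{p}_0 = (t^2 - 1) = [(t - 1), (t + 1)], \qquad s = 2 .
\]
% Thus P corresponds to the two points t = 1 and t = -1 of the line V^*_1.
% The conductor is \mathfrak{c}(\mathfrak{o}, \mathfrak{o}^*) = (t^2 - 1) K[t], and P is its only zero in \mathfrak{o},
% in agreement with Theorem 12.
```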


IV. Normal varieties in the projective space. 


17. A normal variety in the affine space may have singularities at infinity.
Concerning these, Theorem 11 gives us no information whatever. It
may very well happen that a normal $V_r$ has a singular $V_{r-1}$ at infinity. Hence,
from a projective (and consequently also from an algebro-geometric) point of
view, Theorem 11 is not significant. Also the relationship between a $V_r$ and
its derived normal variety $V^*_r$ in an affine space has no invariantive character
from an algebro-geometric standpoint. Thus, for instance, the birational


22 The preceding proof shows that every point of the variety defined by the conductor
is necessarily a singular point of $V_r$. The converse is of course not always true.
On the other hand, it should be noted that to $P$ there may correspond a unique point
of $V^*_r$ even if $P$ lies on the variety of the conductor. Namely, we may have $\mathfrak{o}^*\mathfrak{p}_0 = \mathfrak{q}^*_1$
primary (not prime).




correspondence between $V_r$ and $V^*_r$ may have fundamental loci at infinity.
We shall therefore now deal with the extension of the notion of a normal
variety to projective spaces.

Let $y_0, y_1, \dots, y_n$ be homogeneous coördinates in an n-dimensional projective
space $P_n$. The quotients $x_i = y_i/y_0$, $i = 1, \dots, n$, are point coördinates
in the affine $S_n$ consisting of those points of $P_n$ which are not on the
hyperplane $y_0 = 0$. Let $V_r$ be an irreducible r-dimensional variety in $P_n$,
not in the hyperplane $y_0 = 0$, and let $\xi_1, \dots, \xi_n$ be the coördinates of the
general point of $V_r$ in $S_n$. The field $\Sigma = K(\xi_1, \dots, \xi_n)$ is of degree of
transcendency $r$ over $K$. The variety $V_r$ is defined in $P_n$ by the H-ideal
(homogeneous ideal) $\mathfrak{P}$ in $K[y_0, \dots, y_n]$, generated by all forms $f(y_0, \dots, y_n)$
such that $f(1, \xi_1, \dots, \xi_n) = 0$.

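Now suppose that we choose as hyperplane at infinity another hyperplane,
say $c_0y_0 + \cdots + c_ny_n = 0$, which does not contain the variety $V_r$, and
let $S'_n$ be the affine space consisting of the points of $P_n$ which are not on
this hyperplane. If, say, $c_i \neq 0$, then the $n$ quotients $y_j/(c_0y_0 + \cdots + c_ny_n)$,
$j = 0, 1, \dots, i-1, i+1, \dots, n$, can be taken as coördinates in $S'_n$. The
coördinates of the general point of $V_r$, considered in $S'_n$, are then
$\xi_j/(c_0 + c_1\xi_1 + \cdots + c_n\xi_n)$, $j \neq i$ (with $\xi_0 = 1$). The ring of polynomials
in these coördinates contains also the quotient
\[ \xi_i/(c_0 + c_1\xi_1 + \cdots + c_n\xi_n), \]
since $\sum_{j=0}^{n} c_j\xi_j/(c_0 + c_1\xi_1 + \cdots + c_n\xi_n) = 1$ and $c_i \neq 0$. Thus, inasmuch as we regard $V_r$
as a variety in the affine space $S'_n$, it determines in $\Sigma$ the integral domain
\[ \mathfrak{o}' = K\Bigl[\frac{1}{c_0 + c_1\xi_1 + \cdots + c_n\xi_n}, \frac{\xi_1}{c_0 + c_1\xi_1 + \cdots + c_n\xi_n}, \dots, \frac{\xi_n}{c_0 + c_1\xi_1 + \cdots + c_n\xi_n}\Bigr]. \tag{31} \]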


We go back to the homogeneous coördinates. If we regard $y_0, \dots, y_n$
as non-homogeneous coördinates in an affine $S_{n+1}$, then the homogeneous ideal
$\mathfrak{P}$ defines an $(r + 1)$-dimensional irreducible variety $W_{r+1}$ in $S_{n+1}$. The
coördinates $\xi^*_0, \dots, \xi^*_n$ of the general point of $W_{r+1}$ are the residual
classes in $K[y]/\mathfrak{P}$ containing $y_0, y_1, \dots, y_n$ respectively. If we imagine $P_n$
as being the hyperplane at infinity of $S_{n+1}$, then $W_{r+1}$ is the hypercone which
projects $V_r$ from the origin $y_0 = \cdots = y_n = 0$.

Let $\Sigma^* = K(\xi^*_0, \dots, \xi^*_n)$ be the field of rational functions on $W_{r+1}$.
Its subfield $K(\xi^*_1/\xi^*_0, \dots, \xi^*_n/\xi^*_0)$ is simply isomorphic with the field
$\Sigma = K(\xi_1, \dots, \xi_n)$, and in this isomorphism to $\xi^*_i/\xi^*_0$ there corresponds the
element $\xi_i$. This follows immediately from the relationship between the homogeneous
ideal $\mathfrak{P}$ in $K[y]$ and the defining ideal of $V_r$ in $K[x]$. If then we
identify $\xi^*_i/\xi^*_0$ with $\xi_i$, we may regard $\Sigma$ as a subfield of $\Sigma^*$. The degree of



transcendency of $\Sigma^*$ is one greater than that of $\Sigma$. Since $\xi^*_i = \xi_i\xi^*_0$ (in view
of our identification), we have $\Sigma^* = \Sigma(\xi^*_0)$. Hence $\Sigma^*$ is a simple transcendental
extension of $\Sigma$.²³ Thus we have an invariantive relationship between
$\Sigma$ and $\Sigma^*$, independent of the particular projective model $V_r$ of the field $\Sigma$.
We note the existence of the group of relative automorphisms $\tau$ of $\Sigma^*$ with
respect to $\Sigma$; these are all of the form:
\[ \xi^*_0 \to \frac{\alpha\xi^*_0 + \beta}{\gamma\xi^*_0 + \delta}, \qquad \alpha, \beta, \gamma, \delta \subset \Sigma, \quad \alpha\delta - \beta\gamma \neq 0. \]
Such a relative automorphism is described by a birational transformation of
the hypercone $W_{r+1}$ into itself which leaves invariant each generator.

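Consider the special automorphisms $\tau_t$: $\xi_i \to \xi_i$, $\xi^*_0 \to t\xi^*_0$, $t \subset K$. We
shall say that an element $\omega^*$ in $\Sigma^*$ is homogeneous of degree $\nu$, if $\tau_t(\omega^*) = t^\nu\omega^*$,
$t$ arbitrary in $K$. Any homogeneous element $\omega^*$ of degree $\nu$ belonging to the
ring $\mathfrak{o}^* = K[\xi^*_0, \dots, \xi^*_n]$ is a form of degree $\nu$ in $\xi^*_0, \dots, \xi^*_n$. In fact,
let $\omega^* = f_\rho(\xi^*) + f_{\rho-1}(\xi^*) + \cdots + f_0(\xi^*)$, where $f_i(\xi^*)$ is a form of degree
$i$ in $\xi^*_0, \dots, \xi^*_n$ and $f_\rho \neq 0$. We must have:
\[ t^\nu\omega^* = t^\rho f_\rho(\xi^*) + \cdots + t^0 f_0(\xi^*), \qquad t \text{ arbitrary}. \]
This implies $\rho = \nu$ and $f_i(\xi^*) = 0$ for $i < \rho$.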

The elements of $\Sigma$ are homogeneous of degree 0. Conversely, any element
of $\Sigma^*$ which is homogeneous of degree zero is an element of $\Sigma$. Namely, it is
clear that the elements of $\Sigma^* = \Sigma(\xi^*_0)$ which are left invariant by all the
relative automorphisms $\xi^*_0 \to t\xi^*_0$ are necessarily elements of $\Sigma$.
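
The notion of a homogeneous element can be made concrete for polynomials. The sketch below is an illustration only: it works in a free polynomial ring over the rationals (whereas the ring of the text is a residue class ring), represents a polynomial as a dictionary from exponent tuples to coefficients, and uses the hypothetical names `scale`, `homogeneous_components` and `is_homogeneous`.

```python
from collections import defaultdict
from fractions import Fraction


def scale(poly, t):
    """Apply the substitution x_i -> t * x_i to every variable.

    A monomial of total degree d is multiplied by t**d, which mirrors the
    action of the automorphism tau_t on a form of degree d.
    """
    return {exps: coeff * Fraction(t) ** sum(exps) for exps, coeff in poly.items()}


def homogeneous_components(poly):
    """Group the non-zero terms of poly by total degree."""
    comps = defaultdict(dict)
    for exps, coeff in poly.items():
        if coeff != 0:
            comps[sum(exps)][exps] = coeff
    return dict(comps)


def is_homogeneous(poly, degree):
    """True when every non-zero term of poly has the given total degree."""
    return all(sum(exps) == degree for exps, coeff in poly.items() if coeff != 0)
```

For the polynomial $x_0^2 + 3x_0x_1 + x_1$ the components are $x_0^2 + 3x_0x_1$ (degree 2) and $x_1$ (degree 1); only the first is multiplied by $t^2$ under $x_i \to tx_i$, so the polynomial as a whole is not homogeneous.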

In terms of the field $\Sigma^*$ we are now in a position to define quite generally
the notion of the homogeneous coördinates of the general point of an algebraic
variety $V_r$ in the projective space $P_n$:

Any set of $n + 1$ elements in $\Sigma^*$, say $\zeta^*_0, \dots, \zeta^*_n$, are homogeneous
coördinates of the general point of $V_r$ in $P_n$ if:

(a) 

(b) The field $K(\zeta^*_0, \dots, \zeta^*_n)$ is a transcendental extension of $\Sigma$.
The condition (b) implies that each element $\zeta^*_i$ is transcendental with respect
to $\Sigma$. A particular set of homogeneous coördinates is given by the elements


$\xi^*_0, \xi^*_1, \dots, \xi^*_n$. The most general set of homogeneous coördinates
$\zeta^*_0, \dots, \zeta^*_n$ (relative to our fixed coördinate system $y_0, \dots, y_n$ in $P_n$)
is obtained by multiplying the coördinates $\xi^*_i$ by any element $\sigma^*$ in $\Sigma^*$,
provided, however, that $\sigma^*\xi^*_i$ be transcendental over $\Sigma$. Thus, we may also say
that any set $\zeta^*_0, \dots, \zeta^*_n$ of homogeneous coördinates of the general point of
$V_r$ is of the form: $\rho^*, \rho^*\xi_1, \dots, \rho^*\xi_n$, $\rho^*$ an element of $\Sigma^*$ which is
transcendental over $\Sigma$. An immediate consequence of this is the following: the
correspondence $\xi^*_i \to \zeta^*_i$ sets up an isomorphism between the two rings $K[\xi^*]$
and $K[\zeta^*]$.

23 That $\xi^*_0$ is transcendental with respect to $\Sigma$ $(= K(\xi^*_1/\xi^*_0, \dots, \xi^*_n/\xi^*_0))$ follows
also from the fact that all the algebraic relations between $\xi^*_0, \dots, \xi^*_n$ are
homogeneous (or consequences of homogeneous relations).


18. DEFINITION. Let $\xi^*_0, \xi^*_1, \dots, \xi^*_n$ (or $\sigma^*\xi^*_0, \dots, \sigma^*\xi^*_n$, $\sigma^* \subset \Sigma^*$)
be homogeneous coördinates of the general point of a $V_r$ in $P_n$. The variety
$V_r$ shall be said to be normal (in $P_n$), if the ring $\mathfrak{o}^* = K[\xi^*_0, \dots, \xi^*_n]$ (or
the ring $K[\sigma^*\xi^*_0, \dots, \sigma^*\xi^*_n]$) is integrally closed in its quotient field.

It is clear that the above defining property of a normal variety $V_r$ is
independent of the choice of the factor of proportionality $\sigma^*$, in view of the
remark at the end of the preceding section.

When we speak in the sequel of a normal variety, it will be understood
that the variety is normal in the projective space.

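Let us take any given hyperplane $c_0y_0 + \cdots + c_ny_n = 0$ in $P_n$ as hyperplane
at infinity, and let $S'_n$ be the corresponding affine space. We assert that
if $V_r$ is normal, then it is also normal in the affine space $S'_n$ (provided, of
course, that $V_r$ does not lie entirely in the preassigned hyperplane at infinity).
We have to show that the ring (31), or what is the same, that the ring
\[ \mathfrak{o}' = K\Bigl[\xi^*_0\Big/\sum_{i=0}^{n} c_i\xi^*_i, \ \dots, \ \xi^*_n\Big/\sum_{i=0}^{n} c_i\xi^*_i\Bigr] \tag{32} \]
is integrally closed in $\Sigma$. Let $\sum_{i=0}^{n} c_i\xi^*_i = \eta^*$, and let $\omega$ be an element in $\Sigma$
which depends integrally on $\mathfrak{o}'$. Then it is clear that there exists an integer $\rho$
such that $\omega \cdot (\eta^*)^\rho$ depends integrally on $\mathfrak{o}^* = K[\xi^*_0, \dots, \xi^*_n]$. Hence
\[ \omega \cdot (\eta^*)^\rho = f(\xi^*_0, \dots, \xi^*_n) \subset \mathfrak{o}^*, \]
where $f$ is a polynomial. Now $\omega \cdot (\eta^*)^\rho$ is a homogeneous element of degree $\rho$.
Hence $f$ is necessarily a form of degree $\rho$. It follows that $\omega = f/(\eta^*)^\rho$ is a
polynomial in $\xi^*_0/\eta^*, \dots, \xi^*_n/\eta^*$. Hence $\omega \subset \mathfrak{o}'$, and this proves our
assertion.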
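
The last step of the argument, that a form of degree $\rho$ divided by $(\eta^*)^\rho$ is a polynomial in the quotients $\xi^*_i/\eta^*$, can be seen on a small case. The form chosen below is hypothetical and serves only as an illustration.

```latex
% Illustrative example (the form f is chosen arbitrarily): take \rho = 2 and
\[
  f(\xi^*_0, \xi^*_1, \xi^*_2) = (\xi^*_1)^2 + \xi^*_0 \xi^*_2 .
\]
% Every term has degree 2, so dividing by (\eta^*)^2 distributes one factor
% \eta^* under each of the two coordinates in the term:
\[
  \frac{f}{(\eta^*)^2}
  = \Bigl(\frac{\xi^*_1}{\eta^*}\Bigr)^{2}
  + \frac{\xi^*_0}{\eta^*} \cdot \frac{\xi^*_2}{\eta^*} ,
\]
% which is a polynomial in the generators \xi^*_i / \eta^* of the ring \mathfrak{o}'.
```

The same bookkeeping fails for a non-homogeneous polynomial, which is why the homogeneity of $\omega \cdot (\eta^*)^\rho$ is essential to the argument.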

THEOREM 11'. The manifold of singular points of a normal variety $V_r$
is of dimension $\leq r - 2$.

This follows from Theorem 11 and from the fact that our $V_r$ has just
been proved to be normal in the affine space $S'_n$, for any choice of the
hyperplane at infinity.

In this connection we wish to point out that if, conversely, a $V_r$ has the
property that it is normal in the affine space $S'_n$, for every choice of the
hyperplane at infinity, then it does not yet necessarily follow that $V_r$ is normal in
the projective space $P_n$. We prove namely the following


THEOREM 13. In order that a $V_r$ in $P_n$ be normal in the affine sense, for
every choice of the hyperplane at infinity in $P_n$, it is necessary and sufficient
that the conductor of the ring $\mathfrak{o}^* = K[\xi^*_0, \dots, \xi^*_n]$ with respect to the integral
closure $\bar{\mathfrak{o}}^*$ of $\mathfrak{o}^*$ (in the quotient field of $\mathfrak{o}^*$) divide a power of the
0-dimensional prime ideal $\mathfrak{p}^*_0 = (\xi^*_0, \dots, \xi^*_n)$ (the vertex of the hypercone
$W_{r+1}$).

If $\mathfrak{o}^*$ is not itself integrally closed, the conductor must be then a primary
ideal belonging to $\mathfrak{p}^*_0$.

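Proof. The condition is sufficient. For let the conductor $\mathfrak{c}(\mathfrak{o}^*, \bar{\mathfrak{o}}^*)$ be a
primary ideal $\mathfrak{q}^*$ belonging to $\mathfrak{p}^*_0$, and let $\sigma$ be the exponent of $\mathfrak{q}^*$. If $\eta^*$ is
any linear form in $\xi^*_0, \dots, \xi^*_n$ and if, as before, $\omega$ is any element in $\Sigma$ which
is integrally dependent on $\mathfrak{o}' = K[\xi^*_0/\eta^*, \dots, \xi^*_n/\eta^*]$, then, for some integer
$\rho$, the product $\omega \cdot (\eta^*)^\rho$ is in $\bar{\mathfrak{o}}^*$. Hence $\omega \cdot (\eta^*)^{\rho+\sigma} \subset \mathfrak{o}^*$, since $(\eta^*)^\sigma$
is in $\mathfrak{c}(\mathfrak{o}^*, \bar{\mathfrak{o}}^*)$. We conclude, as before, that $\omega \cdot (\eta^*)^{\rho+\sigma}$ is a form of degree
$\rho + \sigma$ in $\xi^*_0, \dots, \xi^*_n$ and that consequently $\omega \subset \mathfrak{o}'$, i.e. $\mathfrak{o}'$ is integrally
closed in $\Sigma$.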

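The condition is necessary. Suppose that $\mathfrak{o}'$ is integrally closed in $\Sigma$.
Let us consider any homogeneous element $\omega^*$ of $\bar{\mathfrak{o}}^*$. Let $\omega^*$ be homogeneous
of degree $\nu$, and let
\[ \omega^{*m} + a_1\omega^{*m-1} + \cdots + a_m = 0, \qquad a_i \subset \mathfrak{o}^*, \tag{33} \]
be an equation of smallest degree for $\omega^*$ over $\mathfrak{o}^*$, with leading coefficient 1.
The equation remains true if we apply any automorphism: $\xi^*_i \to t\xi^*_i$,
$\omega^* \to t^\nu\omega^*$. The resulting equation must be an identity in $t$. Equating to
zero the coefficient of $t^{\nu m}$, we get an equation similar to (33) but homogeneous
of degree $\nu m$. Hence we may assume that $a_1, \dots, a_m$ are homogeneous of
degrees $\nu, 2\nu, \dots, \nu m$ respectively, i.e. $a_i$ is a form of degree $i\nu$ in $\xi^*_0, \dots, \xi^*_n$.
From (33) it now follows that the element $\omega^*/(\eta^*)^\nu$, which is homogeneous
of degree zero and is therefore an element of $\Sigma$, is integrally dependent
on $\mathfrak{o}' = K[\xi^*_0/\eta^*, \dots, \xi^*_n/\eta^*]$. Hence $\omega^*/(\eta^*)^\nu \subset \mathfrak{o}'$, i.e.
\[ \omega^*/(\eta^*)^\nu = f(\xi^*_0/\eta^*, \dots, \xi^*_n/\eta^*), \qquad f \text{ a polynomial}. \]
Clearing the denominator, we find
\[ \omega^* \cdot (\eta^*)^\rho = g(\xi^*_0, \dots, \xi^*_n), \qquad \rho \geq 0, \]
where $g$ is a form of degree $\rho + \nu$. Thus, given any linear form $\eta^*$ in
$\xi^*_0, \dots, \xi^*_n$, there exists an integer $\rho$ (perhaps depending on $\eta^*$) such that
$\omega^* \cdot (\eta^*)^\rho \subset \mathfrak{o}^*$. In particular, let $\omega^* \cdot (\xi^*_i)^{\rho_i} \subset \mathfrak{o}^*$ and let $\sigma' = 1 + \sum_{i=0}^{n} (\rho_i - 1)$.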




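Then it is clear that $\omega^* \cdot (\eta^*)^{\sigma'} \subset \mathfrak{o}^*$ for any linear form $\eta^*$ in $\xi^*_0, \dots, \xi^*_n$.
Hence $\omega^* \cdot \mathfrak{p}^{*\sigma'}_0 \subset \mathfrak{o}^*$.

Now it is not difficult to see that every element in $\bar{\mathfrak{o}}^*$ is a sum of homogeneous
elements of $\bar{\mathfrak{o}}^*$.²⁴ Since $\bar{\mathfrak{o}}^*$ possesses a finite $\mathfrak{o}^*$-basis, it follows that
there also exists an $\mathfrak{o}^*$-basis in $\bar{\mathfrak{o}}^*$ consisting of homogeneous elements, say
$\omega^*_1, \dots, \omega^*_g$. We have just proved that for each $\omega^*_i$ we can find an integer
$\sigma_i$ such that $\omega^*_i \cdot \mathfrak{p}^{*\sigma_i}_0 \subset \mathfrak{o}^*$. If we put $\sigma = \max(\sigma_1, \dots, \sigma_g)$, then we will
have $\bar{\mathfrak{o}}^* \cdot \mathfrak{p}^{*\sigma}_0 \subset \mathfrak{o}^*$. Hence $\mathfrak{c}(\mathfrak{o}^*, \bar{\mathfrak{o}}^*) \supset \mathfrak{p}^{*\sigma}_0$, and this proves our theorem.

We shall have occasion to point out later (see footnote 26) that there
actually exist varieties which satisfy the condition of the last theorem and
which yet are not normal.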
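
A concrete instance of this phenomenon is the rational space quartic mentioned in footnote 26. Assuming the usual parametrization by the monomials $s^4, s^3t, st^3, t^4$ (an assumption of this illustration, not a statement from the paper), the ring $K[s^4, s^3t, st^3, t^4]$ lacks the single monomial $s^2t^2$, which is integral over it because $(s^2t^2)^2 = s^4 \cdot t^4$. The sketch below, with the hypothetical names `monomials_generated` and `missing_monomials`, enumerates exponent pairs to check that nothing else is missing up to a chosen degree.

```python
# Exponent pairs (a, b) of the four generators s^a t^b of the coordinate ring.
GENERATORS = [(4, 0), (3, 1), (1, 3), (0, 4)]


def monomials_generated(max_degree):
    """Exponent pairs of all products of at most max_degree generators."""
    reached = {(0, 0)}
    frontier = {(0, 0)}
    for _ in range(max_degree):
        # Multiply every monomial of the current degree by each generator.
        frontier = {(a + g[0], b + g[1]) for (a, b) in frontier for g in GENERATORS}
        reached |= frontier
    return reached


def missing_monomials(max_degree):
    """Monomials s^a t^b with a + b = 4k, k <= max_degree, that are not
    products of the generators."""
    reached = monomials_generated(max_degree)
    missing = []
    for k in range(max_degree + 1):
        for a in range(4 * k + 1):
            if (a, 4 * k - a) not in reached:
                missing.append((a, 4 * k - a))
    return missing
```

Up to any degree tried, the only gap is $(2, 2)$, i.e. $s^2t^2$. Any generator multiplied by $s^2t^2$ has degree 2 in the generators and therefore lies in the ring, so in this example the conductor is the ideal $(s^4, s^3t, st^3, t^4)$ of the vertex, as Theorem 13 requires, although the ring is not integrally closed.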
19. There is an important connection between the notion of a normal
variety $V_r$ and the concept of a complete linear system of $V_{r-1}$'s on $V_r$. We
begin with some simple remarks concerning the order of $V_r$.


Let $\xi^*_0, \dots, \xi^*_n$ be homogeneous coördinates of the general point of
$V_r$ and let $\xi_i = \xi^*_i/\xi^*_0$ be the non-homogeneous coördinates. The order $\nu$ of
$V_r$ is ordinarily defined as the number of distinct intersections of $V_r$ with a
general $(n - r)$-dimensional subspace of $P_n$. Let
\[ \eta_i = u_{i0} + u_{i1}\xi_1 + \cdots + u_{in}\xi_n, \qquad (i = 1, \dots, r + 1), \]
where the $u_{ij}$'s are indeterminates. Clearly, the above definition of the order
$\nu$ of $V_r$ is equivalent with the following: $\nu$ is the relative degree of $\Sigma(u_{ij})$
with respect to the field $K(\eta_1, \dots, \eta_r; u_{ij})$ (see [7], p. 82). Now, with respect
to this last field, there always exists in $\Sigma(u_{ij})$ a primitive element of the form
$u^0_{r+1,0} + u^0_{r+1,1}\xi_1 + \cdots + u^0_{r+1,n}\xi_n$, where the coefficients $u^0_{r+1,i}$ are "non-special"
constants in $K$. It follows that if all the $u_{ij}$'s are indeterminates,
the irreducible equation $G(\eta_1, \dots, \eta_{r+1}; u_{ij}) = 0$ between $\eta_1, \dots, \eta_{r+1}$ is of
degree $\nu$ in $\eta_{r+1}$. For reasons of symmetry it also must be of degree $\nu$ in each
of the variables $\eta_1, \dots, \eta_r$. Finally, since it is permissible to operate on
$\eta_1, \dots, \eta_{r+1}$ by non-singular linear transformation, it follows that $G$
must be of degree $\nu$ in all the arguments $\eta_1, \dots, \eta_{r+1}$.

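This is so as long as the $u_{ij}$'s are indeterminates. Now we specialize
$u_{ij} \to u^0_{ij} \subset K$, $\eta_i \to \eta^0_i$. If the polynomial $G(\eta_1, \dots, \eta_{r+1}; u^0_{ij})$ in
$K[\eta_1, \dots, \eta_{r+1}]$ does not vanish identically, then we get an algebraic relation
between $\eta^0_1, \dots, \eta^0_{r+1}$ which is of degree $\leq \nu$. In the contrary case we
still get an equation of degree $\leq \nu$ between $\eta^0_1, \dots, \eta^0_{r+1}$, provided that we
specialize the $u_{ij}$ one at a time and divide, when possible, by factors $u_{ij} - u^0_{ij}$.
Consequently, any $r + 1$ linear polynomials $\eta^0_1, \dots, \eta^0_{r+1}$ in the $\xi_i$ satisfy
an algebraic relation of degree $\leq \nu$. In particular, if $r$ of the elements $\eta^0_j$
are algebraically independent, then the irreducible algebraic relation between
$\eta^0_1, \dots, \eta^0_{r+1}$ is of degree $\leq \nu$. It is equal to $\nu$, when the coefficients $u^0_{ij}$
are not special.

24 Since $\Sigma[\xi^*_0]$ is integrally closed in $\Sigma^* = \Sigma(\xi^*_0)$, it follows that any element in $\bar{\mathfrak{o}}^*$ is a
polynomial in $\xi^*_0$ with coefficients in $\Sigma$. Let now $\omega^*$ be any element in $\bar{\mathfrak{o}}^*$,
$\omega^* = a_0 + a_1\xi^*_0 + \cdots + a_s\xi^{*s}_0$, $a_i \subset \Sigma$, and let $t_0, t_1, \dots, t_s$ be $s + 1$ distinct constants. The
automorphisms $\tau_{t_j}$: $\xi^*_i \to t_j\xi^*_i$ leave $\bar{\mathfrak{o}}^*$ invariant. Hence the $s + 1$ elements
$\omega^*_j = a_0 + t_ja_1\xi^*_0 + \cdots + t^s_ja_s\xi^{*s}_0$ also belong to $\bar{\mathfrak{o}}^*$. From this it follows that the
homogeneous components of $\omega^*$ are in $\bar{\mathfrak{o}}^*$.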

Let $V'_r$ be another algebraic variety, birationally equivalent to $V_r$ and
lying in a projective $n'$-space $P_{n'}$, $n' > n$. Let $\xi^{*\prime}_0, \dots, \xi^{*\prime}_{n'}$ be the homogeneous
coördinates of the general point of $V'_r$. Here the $\xi^{*\prime}_i$ as well as the
$\xi^*_i$ are elements of the field $\Sigma^*$, a simple transcendental extension of $\Sigma$. The
following is self-evident: $V_r$ is a projection of $V'_r$ if $\xi^*_0, \dots, \xi^*_n$ are proportional
to linear forms in $\xi^{*\prime}_0, \dots, \xi^{*\prime}_{n'}$ with coefficients in $K$. Assuming,
as it is permissible, that the $\xi^*_i$ are linearly independent, we may so choose
the coördinate system in $P_{n'}$ that $\xi^*_0, \dots, \xi^*_n$ be proportional to $\xi^{*\prime}_0, \dots, \xi^{*\prime}_n$.

We now define: The system of hyperplane sections of a $V_r$ in $P_n$ is said
to be complete, if $V_r$ is not the projection of a $V'_r$ of the same order as $V_r$
and belonging to a space $P_{n'}$ of dimension $n'$ greater than $n$.²⁵

When we say that $V'_r$ belongs to $P_{n'}$ we mean that $V'_r$ does not lie in any
subspace of $P_{n'}$. Similarly we suppose that $V_r$ belongs to $P_n$.

We now prove the following

THEOREM 14. The system of hyperplane sections of a normal $V_r$ is
complete.²⁶


25 The usual procedure is to give directly the general definition of a complete linear
system of $V_{r-1}$'s on $V_r$. The property on which our definition is based becomes then a
consequence of the definition as applied to the special case when the system of $V_{r-1}$'s is
the system of hyperplane sections. We reverse the procedure. Dealing with an
arbitrary linear system $|V_{r-1}|$ of $V_{r-1}$'s on a $V_r$, we would first transform our $V_r$
into a $V'_r$ on which the system $|V_{r-1}|$ is cut out by the hyperplanes, and then our
definition is applicable, provided $V_r$ and $V'_r$ are birationally equivalent. The case in
which the correspondence between $V_r$ and $V'_r$ is $(\alpha, 1)$, $\alpha > 1$, arises when the system
$|V_{r-1}|$ on $V_r$ is composite with an involution of degree $\alpha$. This case would then require
a separate treatment.

26 This explains our term "normal." In the algebro-geometric literature a variety
is called normal if its system of hyperplane sections is complete. However, it should
be pointed out that while a variety, normal in our (arithmetic) sense, is also normal
in the above geometric sense, the converse is not true. For instance, a curve may be
normal in the geometric sense and still have singularities (example: a plane quartic of
genus 2). Such a curve cannot be normal in the arithmetic sense, in view of Theorem
11', section 18.

On the other hand, a curve may be free from singularities in the projective space
and not be normal in the geometric sense, for instance, a rational space quartic. Such




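Proof. Assume that $V_r$, of order $\nu$ and belonging to $P_n$, is the projection of a birationally equivalent $V'_r$ of the same order $\nu$, lying in a $P_{n'}$, $n' > n$. Let $\xi^*_0, \dots, \xi^*_n$ and $\xi^{*\prime}_0, \dots, \xi^{*\prime}_{n'}$ be the homogeneous coordinates of the general point of $V_r$ and of $V'_r$ respectively. Then $\xi^*_0, \dots, \xi^*_n$ are proportional to $\xi^{*\prime}_0, \dots, \xi^{*\prime}_n$. Let $\xi^*_i = \rho^*\xi^{*\prime}_i$, $i = 0, 1, \dots, n$. The $n' + 1$ elements $\xi^*_0, \dots, \xi^*_n, \rho^*\xi^{*\prime}_{n+1}, \dots, \rho^*\xi^{*\prime}_{n'}$ are proportional to the coordinates $\xi^{*\prime}_i$, and moreover generate the field $\Sigma^*$, since $\Sigma^* = K(\xi^*_0, \dots, \xi^*_n)$. Hence these $n' + 1$ elements can also be taken as homogeneous coordinates of the general point of $V'_r$. We may therefore assume that $\xi^{*\prime}_i = \xi^*_i$, $i = 0, 1, \dots, n$. We drop the primes in the remaining $n' - n$ coordinates, so that now the coordinates of the general point of $V'_r$ are $\xi^*_0, \dots, \xi^*_n, \xi^*_{n+1}, \dots, \xi^*_{n'}$.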

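To prove our theorem we have to show that $V'_r$ actually belongs to $P_n$, i.e. that $\xi^*_{n+1}, \dots, \xi^*_{n'}$ are linearly dependent on $\xi^*_0, \dots, \xi^*_n$. Let $\xi_i = \xi^*_i/\xi^*_0$. Subject to a preliminary linear transformation on $\xi_1, \dots, \xi_n$ we may assume that $\xi_1, \dots, \xi_r$ are algebraically independent and that the relative degree $[\Sigma : K(\xi_1, \dots, \xi_r)]$ is equal to $\nu$. Let us now consider one of the elements $\xi_{n+1}, \dots, \xi_{n'}$, say the element $\xi_{n+1}$. We can find constants $c_1, \dots, c_{n+1}$, $c_{n+1} \neq 0$, such that $\zeta_{n+1} = c_1\xi_1 + \cdots + c_{n+1}\xi_{n+1}$ is a primitive element of $\Sigma$ with respect to the field $K(\xi_1, \dots, \xi_r)$. Let then

$$A_0\,\zeta_{n+1}^{\nu} + A_1\,\zeta_{n+1}^{\nu-1} + \cdots + A_\nu = 0, \qquad A_i \in K[\xi_1, \dots, \xi_r],$$

be the irreducible equation for $\zeta_{n+1}$ over $K(\xi_1, \dots, \xi_r)$. Since $V'_r$ is also of order $\nu$, the above equation must be of degree $\nu$ in all the arguments $\xi_1, \dots, \xi_r, \zeta_{n+1}$. Hence $A_0$ is a constant, and therefore $\zeta_{n+1}$ is integrally dependent on $K[\xi_1, \dots, \xi_r]$. Since $V_r$ is normal, the ring $K[\xi_1, \dots, \xi_n]$ is integrally closed, and consequently $\zeta_{n+1} \in K[\xi_1, \dots, \xi_n]$. Since

$$\zeta_{n+1} = c_1\xi_1 + \cdots + c_{n+1}\xi_{n+1}, \qquad c_{n+1} \neq 0,$$

it follows that $\xi_{n+1}$ is in $K[\xi_1, \dots, \xi_n]$. Passing to the homogeneous coordinates we conclude that there exists an integer $h_0$ such that

$$(\xi^*_0)^{h_0}\,\xi^*_{n+1} \in K[\xi^*_0, \dots, \xi^*_n].$$

In a similar fashion we show the existence of an integer $h_i$ such that

$$(\xi^*_i)^{h_i}\,\xi^*_{n+1} \in K[\xi^*_0, \dots, \xi^*_n], \qquad i = 1, 2, \dots, n.$$

If then $h = \max(h_0, \dots, h_n)$, then

$$(\xi^*_i)^{h}\,\xi^*_{n+1} \in K[\xi^*_0, \dots, \xi^*_n], \qquad i = 0, 1, \dots, n.$$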


a curve is normal in the affine space, for every choice of the hyperplane at infinity (see
footnote 22), but since it is not normal in the geometric sense, it is a fortiori not normal 


in the arithmetic sense. 




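It follows that if we put $\lambda = (h - 1)(n + 1) + 1$, and if $\eta^*$ is an arbitrary form in $\xi^*_0, \dots, \xi^*_n$ of degree $\lambda$, then $\xi^*_{n+1}\eta^* \in K[\xi^*_0, \dots, \xi^*_n]$. Since $\xi^*_{n+1}\eta^*$ is homogeneous of degree $\lambda + 1$ (with respect to the automorphism $\xi^*_i \to t\,\xi^*_i$), it follows that $\xi^*_{n+1}\eta^*$ is a form in $\xi^*_0, \dots, \xi^*_n$ of degree $\lambda + 1$. Let $\omega_1, \omega_2, \dots$ be the various power products of $\xi^*_0, \dots, \xi^*_n$ of degree $\lambda$. If we apply the above result to $\eta^* = \omega_i$, we find:

$$\xi^*_{n+1}\,\omega_i = \sum_j h_{ij}\,\omega_j,$$

where the $h_{ij}$ are linear forms in $\xi^*_0, \dots, \xi^*_n$. Hence $|h_{ij} - \delta_{ij}\,\xi^*_{n+1}| = 0$, where $\delta_{ij} = 0$ or $1$ according as $i \neq j$ or $i = j$. It follows that $\xi^*_{n+1}$ is integrally dependent on $K[\xi^*_0, \dots, \xi^*_n]$, whence $\xi^*_{n+1} \in K[\xi^*_0, \dots, \xi^*_n]$, since $V_r$ is normal. Now $\xi^*_{n+1}$ is homogeneous of degree 1, whence $\xi^*_{n+1}$ is necessarily a linear form in $\xi^*_0, \dots, \xi^*_n$. In a similar fashion it follows that $\xi^*_{n+2}, \dots, \xi^*_{n'}$ are linear forms in $\xi^*_0, \dots, \xi^*_n$, and this completes the proof of our theorem.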
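
The choice of the degree $(h-1)(n+1)+1$ in the proof rests on a simple counting fact: a power product of that degree in $n+1$ letters must contain at least one letter to a power not less than $h$, while in the next lower degree this may fail. The following Python sketch checks the fact by brute force for small values. The function names are ours and are introduced only for illustration.

```python
def exponent_vectors(num_vars, degree):
    """Yield all exponent tuples of length num_vars with the given total degree."""
    if num_vars == 1:
        yield (degree,)
        return
    for first in range(degree + 1):
        for rest in exponent_vectors(num_vars - 1, degree - first):
            yield (first,) + rest


def every_monomial_has_large_exponent(n, h, degree):
    """True if each power product of this degree in n + 1 letters has an exponent >= h."""
    return all(max(e) >= h for e in exponent_vectors(n + 1, degree))


def critical_degree(n, h):
    """The degree (h - 1)(n + 1) + 1 used in the proof."""
    return (h - 1) * (n + 1) + 1
```

For $n = 2$, $h = 3$ the critical degree is 7; in degree 6 the power product with exponents $(2, 2, 2)$ shows that the bound cannot be lowered.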


20. We now proceed to establish the existence of normal varieties in any given class of birationally equivalent varieties. Specifically we shall show that for any given $V_r$ in $P_n$ it is possible to define a class of derived normal varieties, an extension of the analogous notion in affine spaces (see section 16).

Let $V_r$ be an irreducible variety in $P_n$, and let $\Sigma = K(\xi_1, \dots, \xi_n)$, $\Sigma^* = K(\xi^*_0, \dots, \xi^*_n)$, where the $\xi^*_i$ are the homogeneous coordinates of the general point of $V_r$, and $\xi_i = \xi^*_i/\xi^*_0$. Let $\mathfrak{o}^* = K[\xi^*_0, \dots, \xi^*_n]$ and let $\bar{\mathfrak{o}}^*$ be the integral closure of $\mathfrak{o}^*$ in $\Sigma^*$. $\bar{\mathfrak{o}}^*$ is a finite integral domain, say

$$\bar{\mathfrak{o}}^* = K[\zeta^*_1, \dots, \zeta^*_s].$$

Every element in $\bar{\mathfrak{o}}^*$ is a sum of homogeneous elements also belonging to $\bar{\mathfrak{o}}^*$, and a homogeneous element of $\bar{\mathfrak{o}}^*$ which is not a constant is necessarily of positive degree. Hence we may assume that each $\zeta^*_i$ is homogeneous and of degree $\delta_i > 0$. The integers $\delta_1, \dots, \delta_s$ are not necessarily distinct; let $d_1, d_2, \dots, d_q$ be the distinct integers among them and let $d$ be their l.c.m.


We consider in $\bar{\mathfrak{o}}^*$ the homogeneous elements whose degree is a given multiple of $d$, say $\sigma d$. Every such element is a sum of power products of the $\zeta^*_i$, each power product being a homogeneous element of $\bar{\mathfrak{o}}^*$, of degree $\sigma d$. If in a given power product of the $\zeta^*_i$ there are $g_1$ factors of degree of homogeneity $d_1$, $g_2$ factors of degree of homogeneity $d_2$, etc., then we must have

(34) $g_1 d_1 + \cdots + g_q d_q = \sigma d$.

Thus the determination of the homogeneous elements of $\bar{\mathfrak{o}}^*$ of degree $\sigma d$ depends on finding all the non-negative solutions $g_1, \dots, g_q$ of the above diophantine equation. Now suppose that $\sigma \geq q$. Then, for at least one of the integers $g_1, \dots, g_q$ we must have $g_i \geq d/d_i$. If, say, $g_1 \geq d/d_1$, then $(g_1 - d/d_1, g_2, \dots, g_q)$ is a solution in non-negative integers of the equation




(34) with $\sigma$ replaced by $\sigma - 1$, and the system $(g_1, g_2, \dots, g_q)$ is the sum of the two systems

$$(g_1 - d/d_1,\, g_2, \dots, g_q), \qquad (d/d_1,\, 0, \dots, 0),$$

of which the second is a non-negative solution of the equation (34) for $\sigma = 1$. By repeated application of this reduction we conclude that if $\sigma \geq q$, then every solution of (34) in non-negative integers $g_i$ can be expressed as the sum of a non-negative solution of the equation

$$g_1 d_1 + \cdots + g_q d_q = (q - 1)d$$

and of $\sigma - q + 1$ non-negative solutions of the equation $g_1 d_1 + \cdots + g_q d_q = d$. If we now put $\delta = m \cdot (q - 1)d$, $m$ an integer $\geq 1$, we can assert that every non-negative solution of the diophantine equation

$$g_1 d_1 + \cdots + g_q d_q = \rho\delta, \qquad \rho \text{ an arbitrary positive integer},$$

is the sum of $\rho$ non-negative solutions of the diophantine equation

$$g_1 d_1 + \cdots + g_q d_q = \delta.$$

This property is shared by any integer $\delta$ which is a multiple of $(q - 1)d$. Actually it is very likely that $d$ itself enjoys this property, but we have no proof for this conjecture.
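
The additive property just described can be tested numerically in small cases. The following Python sketch enumerates the non-negative solutions by brute force and checks whether every solution belonging to a multiple of a given integer splits into the required number of solutions belonging to that integer. The function names and the degrees used below are our own illustrative assumptions; they are not taken from any particular variety, and a finite search of this kind establishes nothing for all multiples.

```python
from functools import lru_cache
from itertools import product


def solutions(degrees, total):
    """All non-negative integer tuples g with sum(g_i * d_i) == total."""
    ranges = [range(total // d + 1) for d in degrees]
    return [g for g in product(*ranges)
            if sum(gi * di for gi, di in zip(g, degrees)) == total]


def has_splitting_property(degrees, delta, max_rho):
    """Check, for rho = 1..max_rho, that every solution for rho*delta
    is a sum of rho solutions for delta."""
    degrees = tuple(degrees)
    base = solutions(degrees, delta)

    @lru_cache(maxsize=None)
    def splits(g, rho):
        # g splits into rho base solutions if some base solution s <= g
        # can be removed and the remainder splits into rho - 1 of them.
        if rho == 0:
            return not any(g)
        return any(
            all(gi >= si for gi, si in zip(g, s))
            and splits(tuple(gi - si for gi, si in zip(g, s)), rho - 1)
            for s in base
        )

    return all(
        splits(g, rho)
        for rho in range(1, max_rho + 1)
        for g in solutions(degrees, rho * delta)
    )
```

With the degrees 2 and 3 (so that the l.c.m. is 6 and there are two distinct degrees), `has_splitting_property((2, 3), 6, 4)` confirms the property for the integer 6 up to the fourth multiple. The integer 2 fails: $(0, 2)$ solves the equation with right-hand side 6 but is not a sum of three copies of $(1, 0)$.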

Now consider any integer $\delta$ with the above property. Let $\omega^*_0, \dots, \omega^*_m$ denote all the possible power products of $\zeta^*_1, \dots, \zeta^*_s$ whose degree of homogeneity in $\bar{\mathfrak{o}}^*$ is equal to $\delta$. By our choice of $\delta$ it follows that any power product of $\zeta^*_1, \dots, \zeta^*_s$ whose degree of homogeneity in $\bar{\mathfrak{o}}^*$ is a multiple $\rho\delta$ of $\delta$ is necessarily a power product of $\omega^*_0, \dots, \omega^*_m$, of degree $\rho$. Hence, every element in $\bar{\mathfrak{o}}^*$, homogeneous of degree $\rho\delta$, $\rho$ an arbitrary positive integer, can be expressed as a form of degree $\rho$ in $\omega^*_0, \dots, \omega^*_m$.

We shall call character of homogeneity of our $V_r$ in $P_n$ any integer $\delta$ which enjoys this last mentioned property. Any multiple of $(q - 1)d$ is certainly a character of homogeneity of $V_r$.

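Let $\delta$ be a character of homogeneity of $V_r$ and let, as before, $\omega^*_0, \omega^*_1, \dots, \omega^*_m$ be all the possible power products of $\zeta^*_1, \dots, \zeta^*_s$ which are homogeneous elements of $\bar{\mathfrak{o}}^*$, of degree $\delta$. The elements $\omega^*_i$ can be regarded as the homogeneous coordinates of the general point of an algebraic irreducible variety $V'_r$ in a projective space $P_m(y'_0, \dots, y'_m)$. The variety $V'_r$ is birationally equivalent to $V_r$. Namely, the quotients $\omega^*_i/\omega^*_0$, $i = 1, 2, \dots, m$, are homogeneous elements of $\Sigma^*$, of degree zero, and hence are elements of $\Sigma$. Consequently $K(\omega^*_1/\omega^*_0, \dots, \omega^*_m/\omega^*_0) \subseteq \Sigma$. On the other hand, we have

$$\xi_i = \frac{\xi^*_i\,(\xi^*_0)^{\delta - 1}}{(\xi^*_0)^{\delta}}.$$

The elements $\xi^*_i(\xi^*_0)^{\delta-1}$ and $(\xi^*_0)^{\delta}$ are homogeneous of degree $\delta$, and consequently can be expressed as linear forms in $\omega^*_0, \dots, \omega^*_m$. It follows that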




and consequently $\Sigma \subseteq K(\omega^*_1/\omega^*_0, \dots, \omega^*_m/\omega^*_0)$. Hence the two fields $K(\xi_1, \dots, \xi_n)$ and $K(\omega^*_1/\omega^*_0, \dots, \omega^*_m/\omega^*_0)$ coincide, and this proves that $V_r$ and $V'_r$ are birationally equivalent. Note that the equations of the birational transformation between $V_r$ in $P_n(y_0, \dots, y_n)$ and $V'_r$ in $P_m(y'_0, \dots, y'_m)$ are, by (35), of the form $\rho\,y_i = $ (linear form in $y'_0, \dots, y'_m$), $i = 0, 1, \dots, n$, $\rho$ being a factor of proportionality.

The linearity of these equations signifies that $V_r$ is a projection of $V'_r$. We assert that the variety $V'_r$ is normal (in its ambient projective space $P_m$). We have to show that the ring $K[\omega^*_0, \dots, \omega^*_m]$ is integrally closed in its quotient field. We first point out that since $\omega^*_0, \dots, \omega^*_m$ are homogeneous elements of degree $\delta$, every element $\omega^*$ in $\Sigma^*$ which depends integrally on $\omega^*_0, \dots, \omega^*_m$ is a sum of homogeneous elements whose degrees are multiples of $\delta$ and which are also integrally dependent on $\omega^*_0, \dots, \omega^*_m$. In view of the transitivity of integral dependence, the homogeneous components of $\omega^*$ are in $\bar{\mathfrak{o}}^*$. Hence they are forms in $\omega^*_0, \dots, \omega^*_m$, and consequently

$$\omega^* \in K[\omega^*_0, \dots, \omega^*_m], \qquad \text{q. e. d.}$$


21. In the construction of the normal variety $V'_r$ there occur arbitrary elements, for instance the character of homogeneity $\delta$. We thus get a whole class of normal varieties associated with $V_r$. Any variety of this class shall be called a derived normal variety of $V_r$. We wish to investigate the relationship between any two derived normal varieties of $V_r$.

A first arbitrary element in our construction is the choice of the elements $\zeta^*_1, \dots, \zeta^*_s$ such that $\bar{\mathfrak{o}}^* = K[\zeta^*_1, \dots, \zeta^*_s]$. This choice affects the elements $\omega^*_0, \dots, \omega^*_m$, which are the various power products of the $\zeta^*_i$, of degree $\delta$. However, since the elements $\omega^*_i$ always form a linear base for the homogeneous elements of degree $\delta$ in $\bar{\mathfrak{o}}^*$, it is clear that two derived normal varieties of $V_r$ belonging to one and the same character of homogeneity of $V_r$ are projectively equivalent.

Let now $V'_r$ and $V''_r$ be two derived normal varieties of $V_r$ belonging to two distinct characters of homogeneity $\delta'$ and $\delta''$ respectively. Let $P_{\mu'}$ and $P_{\mu''}$ be their ambient projective spaces respectively. Finally, let $\omega'_0, \dots, \omega'_{\mu'}$ (or $\omega''_0, \dots, \omega''_{\mu''}$) be the homogeneous coordinates of the general point of $V'_r$ (or $V''_r$). We observe that if $\delta$ is a character of homogeneity of $V_r$, then any multiple of $\delta$ is a character of homogeneity. We may therefore consider the derived normal variety $M_r$ of $V_r$ belonging to the character of homogeneity $\delta'\delta''$. Let $\omega^*_0, \dots, \omega^*_m$ be the homogeneous coordinates of $M_r$, its ambient




space being a $P_m$. The $\omega^*_i$ are forms of degree $\delta''$ in $\omega'_0, \dots, \omega'_{\mu'}$, and every form of degree $\delta''$ in $\omega'_0, \dots, \omega'_{\mu'}$ is necessarily a linear form in $\omega^*_0, \dots, \omega^*_m$.

It follows that $M_r$ is obtained from $V'_r$ by referring projectively the hypersurfaces of order $\delta''$ in $P_{\mu'}$ to the hyperplanes in $P_m$. (We assume that $\omega'_0, \dots, \omega'_{\mu'}$ are linearly independent, so that $V'_r$ does not lie in a subspace of $P_{\mu'}$; we make similar hypotheses for $V''_r$ in $P_{\mu''}$ and $M_r$ in $P_m$.) In a similar manner it follows that $M_r$ is obtained from $V''_r$ by referring the hypersurfaces of $P_{\mu''}$ of degree $\delta'$ to the hyperplanes of $P_m$. We have therefore the following

THEOREM 15. The birational correspondence between $V'_r$ and $V''_r$ has the property that the linear system of sections of $V'_r$ with the hypersurfaces of order $\delta''$ of its ambient space is transformed into the linear system cut out on $V''_r$ by the hypersurfaces of order $\delta'$ of its ambient space $P_{\mu''}$.


This result implies, in particular, that the correspondence between $V'_r$ and $V''_r$ is (1,1) without exceptions. It is free from fundamental elements on either variety. We connect up this result with the notion of the quotient ring at a point of a $V_r$.

Let $\xi^*_0, \dots, \xi^*_n$ be the homogeneous coordinates of the general point of a $V_r$ in $P_n$ and let $A(a_0, \dots, a_n)$ be a given point of $V_r$. We define as the quotient ring of the point $A$, in symbols: $Q(A)$, the set of all elements $f(\xi^*)/g(\xi^*)$ in $\Sigma$ ($f$, $g$ forms of like degree) such that $g(a) \neq 0$. In other words, $Q(A)$ consists of all elements of $\Sigma$ which have a definite finite value at the point $A$. The quotient ring $Q(A)$ is independent of the choice of coordinates in $P_n$. In particular, if, say, $a_0 \neq 0$, and if we pass to non-homogeneous coordinates $\xi_i = \xi^*_i/\xi^*_0$, $a_i^0 = a_i/a_0$, then $Q(A)$ is the quotient ring of the prime ideal $\mathfrak{p}_0 = (\xi_1 - a_1^0, \dots, \xi_n - a_n^0)$ in the ring $K[\xi_1, \dots, \xi_n]$.

Let now $W_r$ be another variety in a $P_m$, birationally equivalent to $V_r$, and let $\eta^*_0, \dots, \eta^*_m$ be the homogeneous coordinates of the general point of $W_r$. Let

(37) $\rho\,\eta^*_i = f_i(\xi^*_0, \dots, \xi^*_n)$, $\quad \sigma\,\xi^*_j = \varphi_j(\eta^*_0, \dots, \eta^*_m)$,

be the equations of the birational transformation between $V_r$ and $W_r$ (the $f_i$ forms of like degree; similarly for the $\varphi_j$). Assume that the point $A$ on $V_r$ is not fundamental for the transformation. Then the quantities $f_i(a_0, \dots, a_n)$ are not all zero, and there corresponds to $A$ a unique point $B(b_0, \dots, b_m)$ on $W_r$, where $\rho\,b_i = f_i(a_0, \dots, a_n)$. If $g(\eta^*_0, \dots, \eta^*_m)$ is a form such that $g(b_0, \dots, b_m) \neq 0$, then it is clear that $g(f_0, \dots, f_m)$ will be a form in $\xi^*_0, \dots, \xi^*_n$ which is $\neq 0$ at $A$. It follows that $Q(B) \subseteq Q(A)$. If we also assume that $B$ is not a fundamental point on $W_r$, then we may conclude likewise that $Q(A) \subseteq Q(B)$, whence $Q(A) = Q(B)$.
Finally, we point out that if $\mathfrak{o} = K[\xi_1, \dots, \xi_n]$ is integrally closed in its quotient field $\Sigma$, and if $\mathfrak{p}$ is any prime ideal in $\mathfrak{o}$, then the quotient ring $\mathfrak{o}_{\mathfrak{p}}$ is also integrally closed in $\Sigma$. Summarizing, we may state the following

THEOREM 16. The quotient ring of any point $P$ of a normal $V_r$ is integrally closed in the field of rational functions on $V_r$. The birational correspondence between the points of two derived normal varieties $V'_r$ and $V''_r$ of one and the same $V_r$ is (1,1) without exceptions. The quotient rings of any two corresponding points $P'$, $P''$ of $V'_r$ and $V''_r$ respectively, coincide.

We conclude with a final important remark which clears up geometrically the relationship between a $V_r$ and its derived normal varieties $V'_r$.

Let $V'_r$ belong to the character of homogeneity $\delta$ and let $\omega^*_0, \dots, \omega^*_m$ be the homogeneous coordinates of the general point of $V'_r$. Any form in $\xi^*_0, \dots, \xi^*_n$ of degree $\delta$ is necessarily a linear form in $\omega^*_0, \dots, \omega^*_m$. Hence in the birational correspondence between $V_r$ and $V'_r$, to the sections of $V_r$ with the hypersurfaces of order $\delta$ there correspond hyperplane sections of $V'_r$. In other words: if we denote by $V_{r-1}$ the hyperplane sections of $V_r$, then the system of hyperplane sections of $V'_r$ is the complete system $|\delta V_{r-1}|$. Thus we have the following

THEOREM 17. The derived normal varieties $V'_r$ of a given $V_r$ are those on which the hyperplanes cut out the complete system $|\delta V_{r-1}|$, where $\delta$ is a character of homogeneity of $V_r$.


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES 


1. Grell, H., “Zur Theorie der Ordnungen in algebraischen Zahl- und Funktionenkörpern,” Mathematische Annalen, vol. 97 (1927).
2. Krull, W., “Idealtheorie,” Ergebnisse der Mathematik und ihrer Grenzgebiete, IV, 3, Berlin, Springer (1935).
3. Muhly, H. T. and Zariski, O., “The resolution of singularities of an algebraic curve,” American Journal of Mathematics, vol. 61, no. 1 (January, 1939).
4. Ore, O., “Über den Zusammenhang zwischen den definierenden Gleichungen und der Idealtheorie in algebraischen Körpern, I,” Mathematische Annalen, vol. 96 (1926).
5. Rückert, W., “Zum Eliminationsproblem der Potenzreihenideale,” Mathematische Annalen, vol. 107 (1933).
6. Schmeidler, W., “Grundlagen einer Theorie der algebraischen Funktionen mehrerer Veränderlichen,” Mathematische Zeitschrift, vol. 28 (1928).
7. van der Waerden, Moderne Algebra, II.
8. van der Waerden, “Zur algebraischen Geometrie III. Ueber irreduzible algebraische Mannigfaltigkeiten,” Mathematische Annalen, vol. 108 (1933).
9. Zariski, O., “Generalized weight properties of the resultant of n + 1 polynomials in n indeterminates,” Transactions of the American Mathematical Society, vol. 41, no. 2 (March, 1937).
10. Zariski, O., “Some results in the arithmetic theory of algebraic functions of several variables,” Proceedings of the National Academy of Sciences, vol. 23, no. 7 (July, 1937).




ON SIMULTANEOUS EXPANSIONS OF ANALYTIC FUNCTIONS 
IN COMPOSITE POWER SERIES.* 


By A. C. BURDETTE.


Introduction. The purpose of the first part of this paper is to prove that the system (1.1) of linear non-homogeneous functional equations (generalized difference equations) has solutions $g_k(x)$, $k = 1, 2, \dots, n$, analytic in a certain finite region provided that the known functions $f_\nu(x)$, $\nu = 1, 2, \dots, n$, are analytic in an appropriate region and provided that the system is non-singular in the sense defined in § 1.2.

In the second part of the paper we obtain facts concerning the convergence 
of series of the form (2.3) which Carmichael’ has called composite power 
series and apply the local solution of the first part to obtain the simultaneous 
expansion of analytic functions in composite power series. 

The method employed is that used by Carmichael (C) in treating the 
foregoing problem in the case where the independent functions are integral. 
The results of the paper will include, as a special case, part of the results 
obtained by Carmichael in the paper cited. 

I. On the Local Solution of a System of Linear Generalized 
Difference Equations with Constant Coefficients. 


1.1. Formulation of the problem. We consider the problem of solving the system

(1.1) $\sum_{j=1}^{n} c_{\nu j}\, g_j(x - a_{\nu j}) = f_\nu(x)$, $\quad (\nu = 1, 2, \dots, n)$,

of generalized difference equations where the functions $f_\nu(x)$, $\nu = 1, 2, \dots, n$, are analytic in the neighborhood of some appropriate point, say $x = b$, and have there the power series expansions


(1.2) = (v= 1,2," +,m). 


j=0 
For convenience in later work we define » to be the least number satisfying 


the relations 


* Received April 18, 1938.
¹ R. D. Carmichael, “Systems of linear difference equations and expansions in series of exponential functions,” Transactions of the American Mathematical Society, vol. 35 (1933), pp. 1-28. This paper will be denoted hereafter by (C).


(1. 3) lim sup | =~, < (v=1,2,---,n). 
jroo J: 

The coefficients $c_{\nu j}$ and the additive terms $a_{\nu j}$ are given complex constants which will later be subjected to certain negative conditions in order to avoid exceptional cases in the theory of the system. We seek to determine functions $g_k(x)$, $k = 1, 2, \dots, n$, analytic in the neighborhood of $x = b$, which are solutions of the system (1.1).

Numerous writers² have treated related problems and special cases of the one set here. A recent paper by I. M. Sheffer³ deals with a problem very closely related to the one under consideration. He obtains a local solution for a single equation of the type involved in the system (1.1). He also obtains an expansion for a single function similar to the simultaneous expansions of the second part of the present paper.
1.2. Symbolic Operators.⁴ We define the operator $E(a)$ by the relations

$$E(a)\,f(x) = f(x + a),$$
$$\Big\{\sum_k \alpha_k E(a_k)\Big\} f(x) = \sum_k \alpha_k f(x + a_k),$$
$$\alpha E(a) \cdot \beta E(b) = \alpha\beta\,E(a + b),$$

where $\alpha$, $\alpha_k$, $a$, $a_k$, $b$, $\beta$ are constants. These definitions give unique meaning to any polynomial combination of operators $E(a_k)$. Such a polynomial in operators $E$ may be written as a linear function of suitably defined operators $E$. In particular, one may define such an operator by means of the symbolic determinant

(1.4) $\Delta = |\,c_{\nu j}E(-a_{\nu j})\,|$,

this being, by definition, the symbolic operator obtained by expanding the determinant formally as if its elements were ordinary algebraic quantities.

Any polynomial combination of operators $E$ will be said to have the value zero when and only when the result of operating with it upon the function $e^{xt}$ is identically zero when considered as a function of $t$.
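
The formal expansion of such a symbolic determinant can be sketched in a few lines of Python. In the sketch below a polynomial in the operators is stored as a dictionary sending each shift to its coefficient, the product rule for two operators is applied termwise, and the determinant is expanded over permutations. The dictionary representation, the function names, and the use of integer shifts (chosen so that equal shifts compare exactly) are our assumptions for illustration; they do not come from the paper.

```python
import cmath
from itertools import permutations


def op_mul(p, q):
    """Multiply two operators {shift: coeff}: coefficients multiply, shifts add."""
    out = {}
    for a, alpha in p.items():
        for b, beta in q.items():
            out[a + b] = out.get(a + b, 0) + alpha * beta
    return {shift: c for shift, c in out.items() if c != 0}


def perm_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def symbolic_det(matrix):
    """Expand a determinant whose entries are operators {shift: coeff}."""
    total = {}
    for perm in permutations(range(len(matrix))):
        term = {0: perm_sign(perm)}  # shift 0 is the identity operator
        for row, col in enumerate(perm):
            term = op_mul(term, matrix[row][col])
        for shift, c in term.items():
            total[shift] = total.get(shift, 0) + c
    return {shift: c for shift, c in total.items() if c != 0}


def value_on_exponential(op, t):
    """exp(-x t) times the result of applying op to exp(x t); independent of x."""
    return sum(c * cmath.exp(shift * t) for shift, c in op.items())
```

An empty dictionary returned by `symbolic_det` corresponds to an operator having the value zero in the sense just defined, since its effect on the exponential function then vanishes identically in $t$. For instance the matrix with rows `[{1: 1}, {2: 1}]` and `[{2: 1}, {3: 1}]` gives the empty dictionary, while the matrix with rows `[{0: 1}, {1: 1}]` and `[{1: 1}, {0: 1}]` gives two surviving terms.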

Making use of the symbolic operators just introduced we may write the 
system (1.1) in the form 

² See Carmichael (C), p. 2, for a list of some of the important papers in this connection.

³ I. M. Sheffer, “A local solution of the difference equation $\Delta y(x) = F(x)$ and of related equations,” Transactions of the American Mathematical Society, vol. 39 (1936), pp. 345-379.

⁴ These symbolic operators are those used by Carmichael (C). We define them here for the convenience of the reader.




(1.5) $\sum_{j=1}^{n} c_{\nu j}\,E(-a_{\nu j})\,g_j(x) = f_\nu(x)$, $\quad (\nu = 1, 2, \dots, n)$.

The symbolic determinant in (1.4) will be called the symbolic determinant of the system (1.1), or (1.5). This determinant will be called singular when it has the value zero, otherwise it will be called non-singular.

We shall treat the system (1.1), or (1.5), only in the case when its determinant is non-singular. In that case we shall say the system is non-singular.

Since $\Delta$ is assumed to be non-singular, it may be written in the form

(1.6) $\Delta = \sum_{k=1}^{\sigma} c_k\,E(\alpha_k)$,

where the constants $c_k$ are different from zero and the constants $\alpha_k$ are different. The case $\sigma = 1$ is trivial; therefore in what follows we assume $\sigma > 1$.


1.3. Lemmas and definitions. The following lemma, used by Sheffer (l.c.), will prove useful in the work to follow:

LEMMA 1.1. Let $\{\alpha_k\}$ be a finite set of points in the finite plane. There exists a unique circle of smallest radius which covers the set, i.e., includes all points of the set in its interior or on its boundary.

This unique circle associated with the set $\{\alpha_k\}$ will be denoted by the symbol $\gamma\{\alpha_k\}$.
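
For a small set of points the circle of Lemma 1.1 can be found by direct search, since the smallest covering circle either has two of the points as the ends of a diameter or passes through three of them. The following Python sketch, with points given as complex numbers, tries all such candidates and keeps the smallest one that covers the set. The function names and the tolerance `eps` are our own choices for illustration; the search is a brute-force one and is not the argument used by Sheffer.

```python
from itertools import combinations


def circumcenter(a, b, c):
    """Center of the circle through three complex points, or None if collinear."""
    bx, by = (b - a).real, (b - a).imag
    cx, cy = (c - a).real, (c - a).imag
    d = 2 * (bx * cy - by * cx)
    if abs(d) < 1e-12:
        return None
    ux = (cy * (bx * bx + by * by) - by * (cx * cx + cy * cy)) / d
    uy = (bx * (cx * cx + cy * cy) - cx * (bx * bx + by * by)) / d
    return a + complex(ux, uy)


def smallest_circle(points, eps=1e-9):
    """Return (center, radius) of the smallest circle covering the points."""
    points = list(points)
    if len(points) == 1:
        return points[0], 0.0
    # Candidates: circles on a pair as diameter, and circles through a triple.
    candidates = [((p + q) / 2, abs(p - q) / 2) for p, q in combinations(points, 2)]
    for a, b, c in combinations(points, 3):
        center = circumcenter(a, b, c)
        if center is not None:
            candidates.append((center, abs(center - a)))
    covering = [
        (center, r) for center, r in candidates
        if all(abs(z - center) <= r + eps for z in points)
    ]
    return min(covering, key=lambda cr: cr[1])
```

Subtracting the center found in this way from every point of the set gives a set whose covering circle has the same radius and is centered at the origin.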

With a given set $\{a_{\nu k}\}$ we shall associate the set $\{{}_0a_{\nu k}\}$ defined by ${}_0a_{\nu k} = a_{\nu k} - b$. If the center of the circle $\gamma\{a_{\nu k}\}$ is at the point $b$, the center of $\gamma\{{}_0a_{\nu k}\}$ is at zero; the radii of the two circles are the same. In the definitions and lemmas to follow we shall use the set $\{{}_0a_{\nu k}\}$.

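We define the function $h(t)$ by the relation

(1.7) $h(t) = e^{-xt}\,\Delta\,e^{xt} = c_1e^{{}_0\alpha_1 t} + c_2e^{{}_0\alpha_2 t} + \cdots + c_\sigma e^{{}_0\alpha_\sigma t}$,

where $\Delta$ is the symbolic determinant of (1.5) for the set $\{{}_0a_{\nu k}\}$. If we factor $e^{{}_0\alpha_1 t}$ out of $h(t)$ we have left

(1.8) $e^{-{}_0\alpha_1 t}\,h(t) = c_1 + c_2e^{{}_0\alpha_{21} t} + \cdots + c_\sigma e^{{}_0\alpha_{\sigma 1} t}$,

where $c_1 \neq 0$ and ${}_0\alpha_{k1} = {}_0\alpha_k - {}_0\alpha_1$. For a function of the type $h(t)$ Carmichael (C) has proved the following lemma: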


LEMMA 1.2. There exists an infinite set of contours $\Gamma_\nu$, $\nu = 0, 1, 2, \dots$, such that there exists a positive number $\epsilon$ such that $|h(t)| > \epsilon$ for all $t$ on



$\Gamma_\nu$, $\nu = 0, 1, 2, \dots$. Moreover the contours $\Gamma_\nu$ are such that they all contain $t = 0$ in their interior; for $\nu$ greater than some preassigned number $N$ the distance from $t = 0$ to a point $t$ on $\Gamma_\nu$ is not less than $\nu$ nor greater than $\nu + \beta$ where $\beta$ is a sufficiently large positive number; the length of $\Gamma_\nu$ bears a bounded ratio to $2\pi\nu$.


We shall also have occasion to make use of the functions 


i=1 
where $\Delta_{\nu k}$ is the cofactor of the element in the $\nu$-th row and $k$-th column of the symbolic determinant $\Delta$ used in (1.7); $p_{\nu k} \leq (n - 1)!$; and the sets of constants $d_i^{(\nu k)}$, ${}_0\alpha_i^{(\nu k)}$ are functions of the sets of constants $\{c_{\nu k}\}$, $\{{}_0a_{\nu k}\}$ respectively. Let $\kappa$ be defined as the least number satisfying the relation


(1.9) at1= | oa’) — oa, |, (all i, v, k). 
Then by means of Lemma 1. 2 we may state: 
Lemma 1.3. For t on Ty, v>N, 
| A ext | < Me(o?) (+8) 
for all m and k, where « is defined by (1.9) and B is the constant of Lemma 1. 2. 


1.4. Solution of the system (1.1). We shall consider (1.1) in the symbolic form (1.5). Let $\rho$ and $b$ denote, respectively, the radius and center of $\gamma\{a_{\nu k}\}$. We may without loss of generality assume $b = 0$. This amounts to considering the system (1.5) formed on the set $\{{}_0a_{\nu k}\}$. However, for convenience, we retain the same notation.

Form the functions 


where the coefficients $s_{\nu j}$ are those appearing in (1.2); $h(t)$ is the function defined in (1.7); the contours $\Gamma_\nu$ are those of Lemma 1.2; $\Delta_{\nu k}$ is the cofactor of the element in the $\nu$-th row and $k$-th column of $\Delta$; and $i = \sqrt{-1}$. By application of the Lemmas 1.2 and 1.3 one readily obtains that the absolute value of the $j$-th term of (1.10) is dominated by


n 
Me) (3+8) (4+B) 7-4 | | 


| 


for $j > N$. Hence for $\mu$ sufficiently small ($\mu$ being the constant defined by (1.3)) the series (1.10) converge absolutely for




| | < log 


and therefore define functions analytic in the neighborhood of $x = 0$.

The series (1.10) are readily seen to afford a formal solution of the system (1.5). If we impose the further restriction on $\mu$, i.e., on the series (1.2), that
log > p 


then the circles of convergence of the series defining $g_k(x - a_{\nu k})$, $\nu, k = 1, 2, \dots, n$, have a region in common in the neighborhood of $x = 0$. Therefore, under these conditions, the functions $g_k(x)$ defined by (1.10) afford an actual solution of the system (1.5), or (1.1), and we have:


THEOREM 1.1. Let $\rho$ and $b$ denote, respectively, the radius and center of the circle $\gamma\{a_{\nu k}\}$ where the constants $a_{\nu k}$ are such that $b = 0$, and let $\kappa$ denote the constant defined in (1.9). If the functions $f_\nu(x)$, $\nu = 1, 2, \dots, n$, are analytic throughout the interior of the circle $|x| = \mu^{-1}$ where $\mu$ is such that $\log e^{-\kappa}\mu^{-1} > \rho$, the functions $g_k(x)$, $k = 1, 2, \dots, n$, defined by (1.10) are analytic in the neighborhood of $x = 0$ and satisfy the non-singular system (1.1) in a region containing that point in its interior. This region includes the open region common to the set of circles $|x - a_{\nu k}| = \log e^{-\kappa}\mu^{-1}$, $(\nu, k = 1, 2, \dots, n)$.


As a consequence of this theorem we have: 


THEOREM 1.2. Let $\rho$ and $b$ denote, respectively, the radius and center of the circle $\gamma\{a_{\nu k}\}$, and let $\kappa$ denote the constant defined in (1.9). If the functions $f_\nu(x)$, $\nu = 1, 2, \dots, n$, are analytic throughout the interior of the circle $|x - b| = \mu^{-1}$ where $\mu$ is such that $\log e^{-\kappa}\mu^{-1} > \rho$, there exist functions $g_k(x)$, $k = 1, 2, \dots, n$, analytic in the neighborhood of $x = b$ which satisfy the non-singular system (1.1) in a region containing that point in its interior. This region includes the open region common to the set of circles

$$|x - a_{\nu k}| = \log e^{-\kappa}\mu^{-1}, \qquad (\nu, k = 1, 2, \dots, n).^{6,\,7}$$


*It is evident that this theorem gives the best results when the a, used in defining 
4 is a member of the set {,,} such that its distance from the center of the circle 
1{.%,(%)\ is a minimum. This is purely a question of the choice of notation. 
⁶ If $\mu = 0$ the functions $f_\nu(x)$ are integral and we obtain Carmichael's (C) result, p. 14.
"By the same method a similar theorem can be obtained for the more general 
system 
n my 
7 =F, (2) (v=1, 2,---m). 
r=l 
The only changes necessary in such a development are obvious modifications of the 
definitions of the auxiliary functions and constants. 


II. Simultaneous Expansions of Analytic Functions in Composite 
Power Series. 


2.1. Formulation of the problem. For $n > 1$ we consider the problem of expanding $n$ functions $f_\nu(x)$, $\nu = 1, 2, \dots, n$, analytic in the neighborhood of a point $x = b$, simultaneously in composite power series, i.e., we consider the problem of representing these functions in the form

(2.1) $f_\nu(x) = \sum_{k=0}^{\infty}\sum_{j=1}^{n} c_{\nu j}\,\alpha_{kj}\,(x - a_{\nu j})^k$, $\quad (\nu = 1, 2, \dots, n)$,

where the coefficients $\alpha_{kj}$ are independent of both $x$ and $\nu$. We seek expansions (2.1) in which the coefficients $\alpha_{kj}$ are such that the series in the equations

(2.2) $g_j(x) = \sum_{k=0}^{\infty} \alpha_{kj}\,x^k$, $\quad (j = 1, 2, \dots, n)$,

converge in the neighborhood of 0. We subject the constants $c_{\nu j}$, $a_{\nu j}$ to the condition that the determinant $\Delta(t)$, whose element in the $\nu$-th row and $j$-th column is $c_{\nu j}e^{-a_{\nu j}t}$, shall not be identically zero in $t$.

Under the conditions named we shall show that for suitable functions $f_\nu(x)$ such expansions exist.


2.2. Convergence of composite power series. Before continuing with the problem set in the preceding section we shall obtain certain convergence properties of series of the form

(2.3) $\sum_{\nu=0}^{\infty}\sum_{k=1}^{n} c_k\,\alpha_{k\nu}\,(x - a_k)^{\nu}$.

We assume that the $a_k$ are $n$ different constants; that $c_k \neq 0$, $k = 1, 2, \dots, n$; that an infinite number of each set $\alpha_{k\nu}$, $k = 1, 2, \dots, n$, are different from zero; and that

$$\limsup_{\nu \to \infty} |\alpha_{k\nu}|^{1/\nu} = r_k, \qquad (0 < r_k < \infty).$$

There is no loss in assuming $r_k > 0$ because if any $r_k = 0$ the corresponding series

$$\sum_{\nu=0}^{\infty} \alpha_{k\nu}\,(x - a_k)^{\nu}$$

converges throughout the finite plane and consequently has no effect on the region of convergence of (2.1).




Let the series of circles

$$|x - a_k| = r_k^{-1}, \qquad (k = 1, 2, \dots, n),$$

be called the fundamental set of circles, and let the set of points composing the boundaries of the set of circles defined by the relations

$$r_i\,|x - a_i| = r_j\,|x - a_j|, \qquad (i, j = 1, 2, \dots, n),$$

be called the exceptional set.

By means of the convergence properties of ordinary power series and by considering the absolute value of the terms of (2.3) over a subsequence of $\nu = 0, 1, 2, \dots$, we are led to the theorem:


THEOREM 2.1. Let $C$ denote the closed region consisting of all points common to all members of the fundamental set of circles (boundaries included). Then

(i) The series (2.3) converges absolutely at each interior point of $C$; it converges absolutely and uniformly in any closed region interior to $C$;

(ii) The series (2.3) diverges at every point exterior to $C$ and not in the exceptional set;

(iii) If the region $C$ is vacuous there are no points of convergence of the series (2.3) save possibly points of the exceptional set.

Examples can readily be constructed to show that there may be points of convergence in the exceptional set and outside $C$. Examples can also be constructed to show that case (iii) of the theorem may occur.


2.3. Expansions of analytic functions. If we employ the notation defined in (2.2) we may write (2.1) in the form

(2.4) $f_\nu(x) = \sum_{j=1}^{n} c_{\nu j}\,g_j(x - a_{\nu j})$, $\quad (\nu = 1, 2, \dots, n)$.

Suitable solutions of this system evidently lead through (2.2) to the required expansions (2.1). The condition put on $\Delta(t)$ in § 2.1 is just that required to make the results of the first part applicable to the system (2.4). Let the notation of § 1.1 be carried over to this section. Then we may state:


THEOREM 2.2. Let $a_{\nu j}$, $c_{\nu j}$, $\nu, j = 1, 2, \dots, n$, be two sets of complex constants such that $\Delta(t) \not\equiv 0$; let $\rho$ and $b$ denote, respectively, the radius and center of the circle $\gamma\{a_{\nu j}\}$; let $\kappa$ be the constant defined in (1.9). If the functions $f_\nu(x)$, $\nu = 1, 2, \dots, n$, are analytic throughout the interior of the circle $|x - b| = \mu^{-1}$ where $\mu$ is such that $\log e^{-\kappa}\mu^{-1} > \rho$, they have simultaneous expansions of the form (2.1) valid in a region including $x = b$ in its interior. This region includes the open region common to the set of circles

$$|x - a_{\nu j}| = \log e^{-\kappa}\mu^{-1}, \qquad (\nu, j = 1, 2, \dots, n).^{8}$$


The formula (1.10)⁹ affords a means of obtaining suitable coefficients $\alpha_{kj}$ to be used in expansions (2.1). If we define $\Delta_{\nu j}(t)$ to be the cofactor of the element in the $\nu$-th row and $j$-th column of $\Delta(t)$, then coefficients $\alpha_{kj}$ are given by


Tm 


(j =1,2,-: *,n; k=0,1,2,: °°). 


UNIVERSITY OF CALIFORNIA. 


⁸ By means of the generalization indicated in footnote 7 it is clear that simultaneous expansions of the form


n 
k=0 j=1 r=1 
may be obtained. 
⁹ It is to be recalled that (1.10) is formed with respect to the set $\{{}_0a_{\nu k}\}$.


SOME THEOREMS ON QUASI-ANALYTICITY FOR FUNCTIONS 
OF SEVERAL VARIABLES.* 


By S. BOCHNER and A. E. TAYLOR.¹


Introduction. The notion of quasi-analyticity may be described roughly 
as follows. A class of functions defined in a fixed domain is said to be quasi- 
analytic if within the class an individual function is completely determined 
by its behavior in an arbitrarily small sub-region of its domain of definition, 
or, what is sometimes the same thing, if the function is determined by the 
knowledge of its value and that of all its derivatives at a single point. 

In the present paper we shall exhibit various ways of realizing, for definite classes of functions of several variables, this notion of quasi-analyticity. All of our results depend upon the following theorem of Denjoy and Carleman, which has to do with functions of one real variable.²


THEOREM D. Let $f$ be a real function of the real variable $x$, defined and possessing derivatives of all orders on the interval $a \leq x \leq b$. Then if $f^{(n)}(a) = 0$ $(n = 0, 1, \dots)$ and if for all values of $x$ the inequalities

$$|f^{(n)}(x)| \leq m_n \qquad (n = 0, 1, \dots)$$

are satisfied, where the constants $m_n$ are such that the series $\sum m_n^{-1/n}$ is divergent, then $f$ is identically zero.


It is evident from this that for a fixed sequence of positive constants $\{m_n\}$ such that $\sum m_n^{-1/n}$ is divergent the class of functions $f$ defined on $(a, b)$ and such that

$$|f^{(n)}(x)| \leq A^n m_n \qquad (n = 0, 1, \dots)$$

where $A$ is a constant depending only on $f$, is quasi-analytic in the sense that if $f$ and $g$ are in the class, and $f^{(n)}(a) = g^{(n)}(a)$ $(n = 0, 1, \dots)$, then $f = g$.
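
The divergence condition on the constants can be illustrated numerically. For the factorials the terms of the series behave like $e/n$ by Stirling's formula, so that the series diverges, while for the squares of the factorials the terms behave like $e^2/n^2$ and the series converges. The following Python sketch compares partial sums in the two cases; the function names are ours, the sequences are chosen only as examples, and partial sums of course suggest but do not prove divergence.

```python
from math import exp, lgamma


def partial_sum(log_m, first, last):
    """Sum of m_n ** (-1/n) for first <= n <= last, given the map n -> log(m_n)."""
    return sum(exp(-log_m(n) / n) for n in range(first, last + 1))


def log_factorial(n):
    """log(n!), corresponding to the constants m_n = n!."""
    return lgamma(n + 1)


def log_factorial_squared(n):
    """log((n!)**2), a faster-growing comparison sequence."""
    return 2 * lgamma(n + 1)
```

Working with logarithms avoids overflow for large $n$. The block of terms from $n = 100$ to $n = 10000$ contributes a large amount in the first case and a negligible amount in the second.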

When one seeks to establish some similar result for functions of several 
variables a number of different modes of procedure present themselves. If we 


* Received July 28, 1938.
¹ National Research Fellow.
² T. Carleman, Les Fonctions Quasi Analytiques, Paris (1926), p. 20.




consider, to begin with, Euclidean spaces only, we may, with a fixed domain R, 
restrict the magnitude of the function and all its partial derivatives in R. 
Or we may perhaps require restrictions on certain combinations of these deriva- 
tives only, that is to say, on certain differential operators applicable to f in R— 
the successive iterates of the Laplacian, for example. Again, the conditions 
$f^{(n)}(a) = 0$ may have as their analogues in the generalization conditions on
the function and its derivatives at a single point, or they may be replaced by
similar conditions relating to a point set in R with the property that a function
which is analytic in R and vanishes on the set must vanish identically.
We have made use of each of these alternatives, and it is on the basis of them 
that the paper is divided into two parts, the second part dealing with certain 
differential operators, especially the Laplacian. The theorems of the second 
part, it will be seen, are essentially theorems on functions of several variables, 
being genuinely different in character from those of the first part. 

We have also considered Riemannian spaces; in this case likewise we have 
resorted to generalizations of two different kinds. Theorems 3, 4, 5, 7 (Part I) 
and Theorem 10 (Part II) deal with Riemannian spaces. In particular, 
Theorem 10 is concerned with functions defined on a space of constant positive 
curvature, the unit sphere 


$$x_1^2 + \cdots + x_{k+1}^2 = 1$$

in Euclidean space of $k + 1$ dimensions, $k \ge 2$. In its formal statement
Theorem 10 differs from Theorem D merely by the substitution of $\Delta^n f$ for
$f^{(n)}(x)$, $\Delta$ being the Laplace-Beltrami operator for the sphere, and the
substitution of a point set of specified character for the single point $x = a$.

In order to avoid repetitions in the statement of our theorems, all of
which have the same general form, we shall agree at the outset on the following
conventions: a) f shall denote a real, continuous function defined in some
domain (= open set) of a space; it shall possess continuous derivatives of all
orders with respect to the coordinates used. b) $\{m_n\}$, $n = 0, 1, \dots$, shall
denote a sequence of positive constants, arbitrary except for the conditions
imposed in the statement of the theorem in which it occurs. c) Statements
involving the index n shall hold for $n = 0, 1, \dots$ unless the contrary is
explicitly asserted. d) If $(x_1, \dots, x_k)$ are the coordinates of a point, we shall
frequently refer to it as the point x; if f is a function of $x_1, \dots, x_k$ we shall
often express this by the notation f(x). e) $E_k$ shall denote the Euclidean
arithmetic space of k dimensions; $V_k$ shall denote a Riemannian space of k
dimensions. We shall require merely that the components of the fundamental
metric tensor $g_{ij}$ for the space $V_k$ be of class $C^\infty$; that is, they admit
continuous partial derivatives of all orders. We stress the fact that the g's need
not be analytic.




Part I. 
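1. If f is defined over a domain of $E_k$ we introduce the symbols

(1.1) $D_0(f,x) = |f(x)|, \qquad D_n(f,x) = \Big\{ \sum_{\alpha_1,\dots,\alpha_n} \Big( \frac{\partial^n f}{\partial x_{\alpha_1} \cdots \partial x_{\alpha_n}} \Big)^2 \Big\}^{1/2},$

where $\alpha_1, \dots, \alpha_n$ are summed independently from 1 to k.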


THEOREM 1. Let f be defined in a connected domain R in $E_k$, and let $x^0$
be an interior point of R. Then the conditions

a) $D_n(f, x) \le m_n$, x in R
b) $D_n(f, x^0) = 0$
c) $\sum m_n^{-1/n} = \infty$ (the series is divergent)

imply that f is zero throughout R.


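Proof. Selecting an arbitrary straight line emanating from $x^0$ and a point
$x'$ on this line such that the segment $x^0 x'$ lies wholly in R, we have as a
parametric representation of this segment

$$x_i(s) = x_i^0 + a_i s, \qquad a_i = \frac{x_i' - x_i^0}{d}, \qquad 0 \le s \le d,$$

where d is the distance from $x^0$ to $x'$ and s is the distance from $x^0$ to x(s),
so that

(1.2) $\sum_{i=1}^{k} a_i^2 = 1.$

Define

$$F(s) = f(x_1(s), \dots, x_k(s)) = f(x(s)).$$

Then clearly

$$F^{(n)}(s) = \sum_{\alpha_1,\dots,\alpha_n} f_{\alpha_1 \cdots \alpha_n}\, a_{\alpha_1} \cdots a_{\alpha_n}$$

where

$$f_{\alpha_1 \cdots \alpha_n} = \frac{\partial^n f}{\partial x_{\alpha_1} \cdots \partial x_{\alpha_n}}.$$

By (1.2) and the Cauchy-Schwarz inequality we find

$$|F^{(n)}(s)| \le D_n(f, x(s)) \le m_n$$

and so also $F^{(n)}(0) = 0$. Therefore by Theorem D, $F(s) \equiv 0$ and in particular
$f(x') = 0$.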


Since R is connected it is obvious that a finite number of applications of 




the above reasoning allows us to conclude, as in the process of analytic con- 
tinuation, that f is zero at all points of R. 
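The key step of the proof, the bound $|F^{(n)}(s)| \le D_n(f, x(s))$ along a unit direction, can be checked numerically in a small case. The sketch below takes the sample function $f(x, y) = e^x \cos y$ in $E_2$, an assumption chosen only because its mixed partials have a closed form. It computes $D_n$ as the root of the sum of squares over all index tuples, as (1.1) is read here. The helper names are hypothetical.

```python
import math
from itertools import product

def partial(x, y, b):
    """Mixed partial of f(x, y) = exp(x) * cos(y), differentiated b times in y
    (differentiating in x any number of times leaves exp(x) unchanged)."""
    return math.exp(x) * math.cos(y + b * math.pi / 2)

def D_n(n, x, y):
    """Root of the sum of squared n-th partials over all index tuples."""
    total = 0.0
    for idx in product((0, 1), repeat=n):   # each index runs independently over 2 coordinates
        b = sum(idx)                        # number of y-differentiations in this tuple
        total += partial(x, y, b) ** 2
    return math.sqrt(total)

def F_n(n, x, y, a1, a2):
    """n-th derivative at s = 0 of F(s) = f(x + a1*s, y + a2*s)."""
    total = 0.0
    for idx in product((0, 1), repeat=n):
        b = sum(idx)
        total += partial(x, y, b) * a1 ** (n - b) * a2 ** b
    return total

if __name__ == "__main__":
    a1, a2 = math.cos(0.4), math.sin(0.4)   # a unit direction, so a1^2 + a2^2 = 1
    for n in range(6):
        print(n, abs(F_n(n, 0.2, -0.7, a1, a2)), D_n(n, 0.2, -0.7))
```

The inequality holds because the products $a_{\alpha_1} \cdots a_{\alpha_n}$, summed in square over all tuples, give $(a_1^2 + a_2^2)^n = 1$.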

Theorem 1 is perhaps the most obvious generalization of Theorem D. The
generalization may be pushed even further, to include functions defined on an
arbitrary Banach space³ E. For this purpose we introduce the following
considerations. If E is a Banach space with elements x, y, ..., and f is a real
function defined in a domain R in E, we shall say that f is differentiable
arbitrarily often along all lines in R if the quantities

$$d^n f(x; y) = \Big[ \frac{d^n}{ds^n} f(x + s y) \Big]_{s=0}$$

exist for each x in R and each y in E. It is readily seen that if E is the
Euclidean space $E_k$ we have

$$d^n f(x; y) = \sum_{\alpha_1,\dots,\alpha_n} \frac{\partial^n f}{\partial x_{\alpha_1} \cdots \partial x_{\alpha_n}}\, y_{\alpha_1} \cdots y_{\alpha_n}$$

and consequently

$$|d^n f(x; y)| \le D_n(f, x) \Big\{ \sum_i y_i^2 \Big\}^{n/2}.$$


This leads us at once to the statement of our next theorem. 


THEOREM 2. Let f be defined in a connected domain R of a Banach space
E, and let f be differentiable arbitrarily often along all lines in R. Then if
$x^0$ is a point in R the conditions

a) $|d^n f(x; y)| \le m_n \|y\|^n$, x in R, y in E
b) $d^n f(x^0; y) = 0$, y in E
c) $\sum m_n^{-1/n} = \infty$

imply that f is zero throughout R.

Proof. If $x'$ is chosen as in the proof of Theorem 1, and we put

$$F(s) = f(x^0 + s(x' - x^0)), \qquad 0 \le s \le 1,$$

then

$$F^{(n)}(s) = d^n f(x^0 + s(x' - x^0);\ x' - x^0)$$

and by conditions a)-c) and Theorem D we conclude that $F(s) \equiv 0$. The
rest of the proof is then obvious.


2. In order to formulate theorems of the desired type for Riemannian 
spaces it is first of all necessary to devise something corresponding to the 


³ For the definition of such spaces see Banach, Opérations Linéaires, Warsaw (1932),
Chapter V.




quantities $D_n(f,x)$ entering in Theorem 1. It is desirable to define $D_n(f,x)$
in invariant form; covariant differentiation naturally suggests itself. However,
if we keep formulae (1.1) in mind and think in terms of normal coordinates,
it is another process which occurs most readily as the basis for our procedure.
This process is that of extension. All that we need of it for our
purposes is the following. If f is a function defined in $V_k$ with x-coordinate
system, and $x_0$ is a fixed point, we introduce normal coordinates $(y^1, \dots, y^k)$
with origin at this point. Then if

(2.1) $f(x) = f^*(y)$

we define

(2.2) $f_{\alpha_1 \cdots \alpha_n}(x_0) = \Big[ \frac{\partial^n f^*}{\partial y^{\alpha_1} \cdots \partial y^{\alpha_n}} \Big]_{y=0}.$

The functions $f_{\alpha_1 \cdots \alpha_n}$ defined by (2.2) constitute the components of an
absolute tensor, covariant of rank n.⁴

Since we shall make extensive use of normal coordinates it will be
convenient to assemble for reference the principal facts on which we shall rely.
In the neighborhood of any given point of $V_k$ it is possible to introduce normal
coordinates $y^i$ with origin at the point in question. If f is a scalar function
defined over a portion of $V_k$ we shall use $f^*$ to denote this scalar as a function
of normal coordinates about a chosen point. We shall use $\psi_{ij}$ to denote the
components, in normal coordinates, of the fundamental metric tensor of the
space, and $C^i_{\alpha\beta}$ to denote the Christoffel symbols in normal coordinates. The
normal coordinates may be so altered by a linear transformation that at the
origin of the system

(2.3) $\psi_{ij}(0) = \psi^{ij}(0) = \delta_{ij}.$

The equations of a geodesic through the origin are

(2.4) $y^i = a^i s$

where s is the distance along the geodesic, measured from the origin, and the
a's are constants satisfying the relation

(2.5) $\sum_{i=1}^{k} (a^i)^2 = 1.$

Along the geodesic the following fundamental relations hold.⁵


⁴ T. Y. Thomas, The Differential Invariants of Generalized Spaces, Cambridge
University Press (1934), p. 97.
⁵ T. Y. Thomas, op. cit., pp. 85-87.


(2.6) $s^2 = \psi_{ij}(0)\, y^i y^j = \psi_{ij}(y)\, y^i y^j, \qquad \psi_{ij}(0)\, y^j = \psi_{ij}(y)\, y^j,$

(2.7) $C^i_{\alpha\beta}(y)\, y^\alpha y^\beta = 0.$


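With the quantities (2.2) we define

(2.8) $D_0(f,x) = |f(x)|, \qquad D_n(f,x) = \big\{ g^{\alpha_1\beta_1} \cdots g^{\alpha_n\beta_n} f_{\alpha_1 \cdots \alpha_n} f_{\beta_1 \cdots \beta_n} \big\}^{1/2}.$

Using normal coordinates with origin at x, and taking account of (2.2), (2.3)
we see that

$$D_n(f,x) = \Big\{ \sum_{\alpha_1,\dots,\alpha_n} \Big( \frac{\partial^n f^*}{\partial y^{\alpha_1} \cdots \partial y^{\alpha_n}} \Big)^2 \Big\}^{1/2}_{y=0}.$$

Thus $D_n(f,x)$ is a real, non-negative scalar.
We are now in a position to enunciate a theorem.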

THEOREM 3. Let f be defined in a connected domain R of $V_k$. Then,
$x_0$ being a point in R and $D_n(f,x)$ being given by (2.8), f is identically zero
provided that conditions a)-c) of Theorem 1 are fulfilled.

Proof. Consider a geodesic sphere of radius r in R with center at $x_0$, and
a fixed geodesic passing through $x_0$. Along this geodesic f becomes a function
F(s) of the distance s from $x_0$. Let $x_1$ be an arbitrary point on the given
geodesic, within the geodesic sphere. Then if we choose normal coordinates
with origin at $x_1$ we have equations similar to (2.4) for the geodesic, s being
replaced by $s - s_1$, where $s_1$ is the value of s at $x = x_1$. Clearly

$$F^{(n)}(s_1) = \Big[ \frac{\partial^n f^*}{\partial y^{\alpha_1} \cdots \partial y^{\alpha_n}} \Big]_{y=0} a^{\alpha_1} \cdots a^{\alpha_n}$$

and therefore, by (2.5) and the Schwarz inequality

$$|F^{(n)}(s_1)| \le D_n(f, x_1) \le m_n.$$

Hence also $F^{(n)}(0) = 0$. The reasoning then follows that of the previous
proofs.


THEOREM 4. The hypothesis and conclusion here are the same as in
Theorem 3, except that in the definition of $D_n(f,x)$ [formula (2.8)], $f_{\alpha_1 \cdots \alpha_n}$
shall signify the components of the n-th successive covariant derivative of f.

Proof. Consider as before a geodesic sphere with center at $x_0$, and the
function F(s). It is clear that it is sufficient to prove that for an arbitrary
geodesic through $x_0$ the inequalities


(2.9) $|F^{(n)}(s)| \le D_n(f, x)$

are satisfied, where x and s refer to the same point. We shall first turn our
attention to a lemma independent of the above considerations.


LEMMA 1. Let S be a region in $V_k$ which is entirely covered by a system
of normal coordinates $y^i$ with origin in the region. Let f be a scalar defined
in S. If the geodesics through the origin are defined by (2.4) subject to
condition (2.5), so that the variables (s, a) furnish a fixed set of "polar"
coordinates covering S, we define

$$F(s; a) = f^*(y).$$

In the region S with the origin excluded the derivatives

$$F^{(n)}(s; a) = \frac{\partial^n F(s; a)}{\partial s^n}$$

are then seen to be scalars. As such they are given by the formulae

(2.10) $F^{(n)}(s; a) = f^*_{\alpha_1 \cdots \alpha_n}\, \lambda^{\alpha_1} \cdots \lambda^{\alpha_n}, \qquad \lambda^\alpha = \psi^{\alpha\beta} \lambda_\beta,$

where $\lambda_i$ are the components of the covariant derivative of the scalar

(2.11) $s = \{ \psi_{ij}(0)\, y^i y^j \}^{1/2}$

and $f^*_{\alpha_1 \cdots \alpha_n}$ are the components of the n-th successive covariant derivative of $f^*$.


Proof of the lemma. We have 


_ 
Now 
00 of* 
dy? f a 


and, remembering relations (2. 6) 


G2 Wis (O)y’ 
dy* 2 2 
from which follows 


(2. 12) 20 


Therefore (2.10) has been proved for n = 1. 
Replacing the scalar F(s;a) by F’(s;a) and applying the result just 
obtained we get 




= jf* pa + 212, 


But this will agree with (2.10) if 


To prove (2.13) we write the left side in the form 


y* (y? yj 
which is possible because of (2.12) and the fact that the covariant derivative
of the tensor $\psi_{ij}$ vanishes. But the last expression above on the right is zero
because of the fundamental relations (2.6), (2.7), as is easily verified.
Proceeding in this way, by repeated use of (2.10) for n = 1, and the
relation (2.13), we conclude (2.10) in general.
If in (2.10) we fix the a's and allow s to approach zero we find, in view
of (2.12) and (2.4), (2.5), that

(2.14) $\lim_{s \to 0} F^{(n)}(s; a) = f^*_{\alpha_1 \cdots \alpha_n}(0)\, a^{\alpha_1} \cdots a^{\alpha_n}.$


Now we return to the proof of Theorem 4. We shall prove the validity
of (2.9). To do this we fix the geodesic and the point x on it. If now we
consider a neighborhood of x, and normal coordinates with this point as origin,
we may apply the lemma to the scalar f in this neighborhood. In particular,
from (2.14) we infer that

$$|F^{(n)}(s)| \le \Big\{ \sum |f^*_{\alpha_1 \cdots \alpha_n}(0)|^2 \Big\}^{1/2} \qquad (n = 1, 2, \dots).$$

But

$$D_n(f, x) = \Big\{ \sum |f^*_{\alpha_1 \cdots \alpha_n}(0)|^2 \Big\}^{1/2}$$

because of (2.3). Hence $|F^{(n)}(s)| \le D_n(f, x)$. Thus the proof is complete.


Remark. In Theorems 3 and 4 we may replace the assumption that $x_0$ is
an interior point of R by the requirement that it be a boundary point of R
which is the vertex of a conical sub-region of R filled out entirely by geodesic
arcs issuing from $x_0$, the condition b) being in this case replaced by the
condition

b') $\lim D_n(f, x) = 0$ as $x \to x_0$ along a path in R.


3. In this section we shall obtain three theorems in which restrictions 




are placed, not on the quantities $D_n(f,x)$ themselves, but on the integrals of
these expressions.


THEOREM 5. If f is defined in a connected domain R in $V_k$, and $D_n(f,x)$
has the meaning ascribed to it in either of the two preceding theorems, then
f vanishes identically provided that conditions a)-c) below are fulfilled.


b) $f \equiv 0$ in a neighborhood of a point $x_0$ in R
c) $\sum m_n^{-1/n} = \infty$,
($m_{-1}$ is a positive constant, and $dv = \sqrt{g}\, dx^1 \cdots dx^k$ is the volume element
in $V_k$).
Proof. Consider the non-negative function 
= 1 


From a) we see that

$$\int_R \Phi\, dv \le 2.$$

Let $S_0$ be a geodesic sphere with center at $x_0$, lying in R, and entirely covered
by a system of normal coordinates with origin at $x_0$. Since in such a sphere
$\sqrt{g} \ge c > 0$ we conclude that the integral

$$\int_{S_0} \Phi\, dy^1 \cdots dy^k$$

is finite, $y^i$ being normal coordinates. Now we may evaluate this integral by
polar coordinates,

$$\int_{S_0} \Phi\, dy^1 \cdots dy^k = \int_{W_k} d\omega \int_0^{r_0} s^{k-1} \Phi\, ds,$$

where $r_0$ is the radius of $S_0$, and $W_k$, $d\omega$ denote the surface of the k-dimensional
unit sphere and its surface element, respectively. By Fubini's theorem we
conclude that

$$\int_0^{r_0} s^{k-1} \Phi\, ds$$

is finite for almost all geodesics issuing from $x_0$. But f, and therefore $\Phi$,
vanishes in the neighborhood of $x_0$. Consequently the integral remains finite
with the factor $s^{k-1}$ of the integrand suppressed.

It results from these considerations that if we define F(s) as in the preceding
work then since $|F^{(n)}(s)| \le D_n(f, x)$ and $F^{(n)}(0) = 0$ we have




$$|F^{(n)}(s)| \le \int_0^{s} |F^{(n+1)}(\sigma)|\, d\sigma \le \int_0^{r_0} D_{n+1}(f, x)\, ds,$$

$$|F^{(n)}(s)| \le C^n m_n,$$

where C is a constant depending on f and the particular geodesic. By
Theorem D, F(s) must be zero, so that f vanishes along almost all geodesics
through $x_0$ in $S_0$. It then vanishes identically, for it is continuous. The
conclusion of the theorem is then reached as before.

The stringency of condition b) can be removed if we alter c) somewhat. 

We shall do this first for Euclidean space. 


THEOREM 6. Let f be defined throughout $E_k$, and let $x_0$ be an arbitrary
point. Then the conditions

a) $\int_{E_k} D_n(f, x)\, dv \le m_n$
b) $D_n(f, x_0) = 0$


ce) (mate mayer)" = 
imply that f = 0. 


Proof. Let

$$\Phi(x) = \sum_{n=0}^{\infty} \frac{D_n(f, x)}{2^n m_n}.$$

Then by a)

$$\int_{E_k} \Phi\, dv \le 2.$$

If we adopt spherical coordinates about the point $x_0$ and denote by s the
distance measured along a half-line issuing from $x_0$, then we conclude by Fubini's
theorem, in a manner similar to that of the previous proof, that for almost all
half-lines emanating from $x_0$ the integral

$$\int_0^\infty s^{k-1} \Phi\, ds$$

is finite, and therefore, for some constant A depending on f and the particular
line

$$\int_0^\infty s^{k-1} D_n(f, x(s))\, ds \le A^n m_n.$$

Now, in the notation used in proving Theorem 1, $F(s) = f(x(s))$ and
$|F^{(n)}(s)| \le D_n(f, x(s))$. Thus

(3.1) $\int_0^\infty s^{k-1} |F^{(n)}(s)|\, ds \le A^n m_n,$

(3.2) $F^{(n)}(0) = 0.$


It will be proved in the lemma which follows that (3.1) and (3.2), together
with condition c) imply that $F(s) \equiv 0$ $(0 \le s < \infty)$. Since f is continuous
in $E_k$, this implies that $f \equiv 0$, thus completing the proof. We turn now to
the lemma.

LEMMA 2. Let F be a real function of class $C^\infty$ on the infinite interval
$(0, \infty)$, and let the relations (3.1), (3.2) above, and condition c) of Theorem
6 be fulfilled for an integer $k \ge 1$. Then $F \equiv 0$.


Proof. $F^{(n)}$ is absolutely integrable, by (3.1). Therefore, for complex z
with positive real part ($R(z) > 0$) the integral

(3.3) $H(z) = \int_0^\infty e^{-zs} F(s)\, ds$

exists and defines an analytic function. Integrating by parts repeatedly and
using (3.2) we obtain

$$z^n H(z) = \int_0^\infty e^{-zs} F^{(n)}(s)\, ds.$$

Now we write $H_n(z) = z^n H(z)$. Then

$$\frac{d^{k-1} H_n(z)}{dz^{k-1}} = (-1)^{k-1} \int_0^\infty s^{k-1} e^{-zs} F^{(n)}(s)\, ds$$

and if $R(z) > 0$ we infer from (3.1) that

(3.4) $\Big| \frac{d^{k-1} H_n(z)}{dz^{k-1}} \Big| \le A^n m_n.$
For a fixed z, R(z) > 0, we define 
(3. 5) o(t) = Hn((1— t)z+ 2) 


Integration by parts, applied k — 2 times to the equation 


— 40) # (Hat 


yields 
6(0) = 56 (1) + 
Hence 
(3.6) H, (2) = Pale) + (tat 


where 




(3. 7) P,(z) = (1) 


and 

Combining (3.4) and (3.6) we find 
1 


A™mn. 


$P_n(z)$ as a function of n is a polynomial of degree $\le k - 2$. Hence,
differencing with respect to n,


(3. 9) an (n=k—1,k,° °°). 
Now 


) Fan(s) = (2) 
= 
Thus, combining (3.8), (3.9), (3.10) 


2—1 |* 


k—1\| 
n-k+1 k-1 <= n-v 
From this inequality we obtain, making the unimportant assumption that
$A \ge 1$, the inequalities


It then follows by condition c) and a lemma of Carleman⁶ (which he uses for
the proof of Theorem D) that the analytic function H(z) is identically zero.
Then $F \equiv 0$ also, as was to be proved.

Before going on to the next theorem we shall prove another lemma. 


LEMMA 3. Let f(x) be of class $C^\infty$ on (0, 1), and suppose that

a) $\int_0^1 x^{k-1} |f^{(n)}(x)|\, dx \le A^n m_n$

b) $f^{(n)}(0) = 0$
ec) 


⁶ T. Carleman, op. cit., p. 20, Corollary. The statement is that if $\Phi(z)$ is analytic
in the half plane R(z) 2a20, and if | ®(z)2\n| where0 <A, <A, <h, 
<-.-70 and f, > 0, then the divergence of the series 
> An 
B, 
implies = 0. 




where k is an integer $\ge 1$ and A is a constant (which we may assume to be
$\ge 1$). Then $f \equiv 0$.


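Proof. We shall reduce the problem to that of Lemma 2. Putting

$$x = \frac{t}{1 + t}, \qquad g(t) = f(x),$$

we calculate the derivatives of g(t):

(3.12) $g^{(n)}(t) = \sum_{\nu=1}^{n} a_{n,\nu}\, \frac{f^{(\nu)}(x)}{(1 + t)^{n+\nu}}$

where the $a_{n,\nu}$ are constants, with $a_{n,n} = 1$. Thus, using a)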

COL < An 
(3, 13) > | dn,» | my. 


If we differentiate (3.12) we obtain the recursion formula

(3.14) $a_{n+1,\nu+1} = a_{n,\nu} - (n + \nu + 1)\, a_{n,\nu+1}$

which is correct in all cases if we define $a_{n,0} = 0$, $a_{n,\nu} = 0$ if $\nu > n$. Therefore

(3.15) $|a_{n+1,\nu+1}| \le |a_{n,\nu}| + 2n\, |a_{n,\nu+1}|.$

Consider now the expression

$$P_n(\xi) = \xi(\xi + 2) \cdots (\xi + 2n - 2) = \sum_{\nu} b_{n,\nu}\, \xi^\nu.$$

$P_n(\xi)$ is a polynomial of degree n in $\xi$. Clearly $b_{n,0} = 0$ and

(3.16) $P_n(\xi) \le (\xi + 2n)^n.$

It is easily verified that $b_{n,n} = 1$ and that

(3.17) $b_{n+1,\nu+1} = b_{n,\nu} + 2n\, b_{n,\nu+1}$

where we again adopt the conventions $b_{n,0} = 0$, $b_{n,\nu} = 0$ if $\nu > n$. Since
$a_{1,1} = b_{1,1} = 1$ a comparison of (3.15) and (3.17) shows that

(3.18) $|a_{n,\nu}| \le b_{n,\nu}.$
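The comparison of the two recursions can be checked mechanically. The sketch below assumes they read $a_{n+1,\nu+1} = a_{n,\nu} - (n+\nu+1)\,a_{n,\nu+1}$ and $b_{n+1,\nu+1} = b_{n,\nu} + 2n\,b_{n,\nu+1}$, both started from the value 1 at $n = \nu = 1$, which is a reading of (3.14) and (3.17) rather than a certainty. It tabulates both arrays and compares them entry by entry. The function names are hypothetical.

```python
def a_table(n_max):
    """Rows a[n][v] from a[1][1] = 1 and a[n+1][v+1] = a[n][v] - (n + v + 1) * a[n][v+1].
    Missing entries (v = 0 or v > n) count as zero."""
    a = {1: {1: 1}}
    for n in range(1, n_max):
        prev = a[n]
        a[n + 1] = {v + 1: prev.get(v, 0) - (n + v + 1) * prev.get(v + 1, 0)
                    for v in range(0, n + 1)}
    return a

def b_table(n_max):
    """Rows b[n][v] from b[1][1] = 1 and b[n+1][v+1] = b[n][v] + 2n * b[n][v+1]."""
    b = {1: {1: 1}}
    for n in range(1, n_max):
        prev = b[n]
        b[n + 1] = {v + 1: prev.get(v, 0) + 2 * n * prev.get(v + 1, 0)
                    for v in range(0, n + 1)}
    return b

def P(n, xi):
    """The product xi * (xi + 2) * ... * (xi + 2n - 2)."""
    out = 1
    for j in range(n):
        out *= xi + 2 * j
    return out

if __name__ == "__main__":
    a, b = a_table(6), b_table(6)
    for n in range(1, 7):
        print(n, [a[n][v] for v in sorted(a[n])], [b[n][v] for v in sorted(b[n])])
```

The b-recursion is just multiplication of the polynomial by $(\xi + 2n)$, so the b's are the coefficients of the product, and the domination follows because $n + \nu + 1 \le 2n$ whenever $a_{n,\nu+1} \ne 0$.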
Now let us write $\alpha_n = m_n^{1/n}$. By c) the $\alpha$'s have the property


(3.19) 




From (3.13), (3.18), (3.19) and (3.16) we conclude 


(3. 20) < A" (an + 2n)", 


Finally, we set 


h(t) 9(t) 


Then 
and so 
n (n-v) 


From this result and (3.19), (3.20) we get 
* 
J | (t)| dt A"(a, + 3n 
0 
or, for a suitable constant B (depending only on A and k), 


(3. 21) h(t) | < + 
0 


Since $h^{(n)}(0) = 0$ the conclusion $h(t) \equiv 0$ will follow from Lemma 2 if we
can prove that, writing $\beta_n = \alpha_n + 4n$, we have


(3. 22) 2 (Bn™ + — 

Now fi Therefore the series in (3.22) will 
diverge if 

(3. 23) = 0, 


In proving this we shall distinguish two cases. Suppose first that for all
sufficiently large values of n, $\alpha_n \ge n$, $\beta_n \le 5\alpha_n$. In this case the series (3.23)
will diverge if

D 0, 


But this is true because of condition d) of the lemma and the definition of $\alpha_n$.
The alternative situation is that in which there exists a strictly increasing
sequence $\{n_\nu\}$ of integers such that $\alpha_{n_\nu} < n_\nu$ $(\nu = 1, 2, \dots)$. Thus we have


(3. 24) Bn = if Ny. 


Now with the omission of at most some early terms the series (3. 23) may be 
written 




n=ny-y+1 


and so, by (3. 24), it diverges if the series 
(3. 25) > (ny) 


is divergent. 

In proving the divergence of (3.25) we may obviously assume that $n_\nu$
tends to infinity as rapidly as we please, for this reduces the size of the partial
sums. In particular if we can show that for some constant $\epsilon > 0$ and each
integer $m \ge k$ it is possible to determine an integer N = N(m) so large that

$$\sum_{n=m}^{N} N^{-n/(n-k+1)} \ge \epsilon,$$

then we see that with $n_{\nu-1}$ chosen, $n_\nu$ may be selected so as to contribute at
least $\epsilon$ to the sum of the series, which must then diverge. Now for fixed m > k

$$\lim \sum_{n=k}^{m} N^{-n/(n-k+1)} = 0 \quad \text{as } N \to \infty.$$

Hence we need only prove that

(3.26) $\liminf \sum_{n=k}^{N} N^{-n/(n-k+1)} > 0 \quad \text{as } N \to \infty.$

Now

N Nt1 
—n/n—-k+1 > 7-t/t-k+1 
SN N dt. 
k 


n=k 


Making the change of variable u= log N/t—k-4+ 1 we obtain, as soon as 
log N > 1, 


k+l > log N (k-1)u (a = log N) 
log N (k-1 7 ¢ 
— 2) fog N — 1] 
and so 
N 
lim inf N-n/n-k+1 e7(k-1) > 0) as N 
n=k 


This completes the proof of the lemma. It is to be used in the proof of 
the following theorem. 


THEOREM 7. Let f be defined in a connected domain R in $V_k$. Then if
$x_0$ is a point in R and $D_n(f,x)$ has the significance ascribed to it in either
Theorem 3 or Theorem 4, the conditions




a) $\int_R D_n(f, x)\, dv \le m_n$
b) $D_n(f, x_0) = 0$

together with conditions c), d) of Lemma 3, imply that $f \equiv 0$.


Proof. We may show, by considerations similar to those involved in the
proof of Theorem 5, that for a geodesic sphere with center at $x_0$ and radius $r_0$,
the inequalities

$$\int_0^{r_0} |F^{(n)}(s)|\, s^{k-1}\, ds \le A^n m_n$$

are satisfied for almost all geodesics through $x_0$. By a trivial transformation
we may assume that $r_0 = 1$. Then since $F^{(n)}(0) = 0$ the previous lemma
assures us that $F(s) \equiv 0$ for almost all geodesics. The remainder of the proof
then goes as before.
Part II. 


4. In this section we shall present two theorems about functions defined
in the whole Euclidean space $E_k$. They differ from the theorems of Part I in
two respects. Instead of the quantities $D_n(f,x)$ defined by (1.1) we consider
differential operators applied to f. These need not involve all the partial
derivatives of f. The second difference lies in the fact that we replace the
single point $x_0$ by a certain type of point set. For this we lay down the
following definition:


DEFINITION. A point set U contained in a region R of $E_k$ will be called
a set of analytic determination, or simply a set of determination, if a function
$\varphi$ which is analytic in R and which has the value zero at all points of U must
vanish identically in R.


For the first theorem the differential operators are defined in the following
manner. Let p be a fixed positive integer. For each positive integer n let
$P_n(\xi_1, \dots, \xi_k)$ be a homogeneous polynomial in $\xi_1, \dots, \xi_k$ of degree np, with
real coefficients. We define $P_0(\xi_1, \dots, \xi_k) \equiv 1$. Then if f and all its
derivatives are defined in $E_k$ the differential operators

$$P_n\Big( \frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_k} \Big)$$

have an obvious meaning when applied to f, the result being a function
defined in $E_k$. We use the notation

$$L_n(f, x) = P_n\Big( \frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_k} \Big) f(x).$$




THEOREM 8. Let f be defined in $E_k$. With a certain fixed Cartesian
coordinate system let f(x) = 0 except when $x_i > 0$ $(i = 1, \dots, k)$,
and let U be a set of determination contained in the region $x_i > 0$ $(i = 1, \dots, k)$.
Then if

a) $|L_n(f, x)| \le m_n$, x in $E_k$
b) S| = 0, in U 


the function f vanishes identically. 


Proof. We shall first carry out the proof under the additional assumption
that all the partial derivatives of f are bounded in $E_k$. (f itself is bounded,
because of a).)

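We form the function

$$F(z) = F(z_1, \dots, z_k) = \int_0^\infty \cdots \int_0^\infty f(t)\, e^{-(z_1 t_1 + \cdots + z_k t_k)}\, dt_1 \cdots dt_k.$$

Since f is bounded F is analytic if $R(z_i) > 0$, $i = 1, \dots, k$. Under our
temporary assumption we may integrate by parts; since f(x) = 0 if any $x_i = 0$
(and similarly for the derivatives) we obtain, for any non-negative integer n,

$$P_n(z) F(z) = \int_0^\infty \cdots \int_0^\infty L_n(f, t)\, e^{-(z_1 t_1 + \cdots + z_k t_k)}\, dt_1 \cdots dt_k$$

and if $R(z_i) = x_i > 0$ it follows that

$$|P_n(z) F(z)| \le \frac{m_n}{x_1 \cdots x_k}.$$

Now let $(\xi_1, \dots, \xi_k)$ be any point in U, and place $z_i = \xi_i w$, where $R(w) > 0$.
Then by the homogeneity of $P_n$ we obtain

$$|P_n(\xi)\, w^{np}\, F(\xi_1 w, \dots, \xi_k w)| \le \frac{m_n}{(R(w))^k\, \xi_1 \cdots \xi_k}.$$

For fixed $\xi$ define $\Phi(w) = F(\xi_1 w, \dots, \xi_k w)$. $\Phi(w)$ is analytic if $R(w) > 0$. If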


| (E) (Ew, 


R(w){é.° 
we have 


But by condition b) it follows from a result of Carleman already referred to,⁷


⁷ Cf. footnote 6.




that $\Phi(w) \equiv 0$. In this way we see that the analytic function F vanishes at
all points of U, and therefore $F \equiv 0$. This implies that $f \equiv 0$.

We have now to free ourselves of the assumption of boundedness of the
partial derivatives of f. We do this by approximating the function f by a
function whose derivatives are bounded.

By a general theorem⁸ it is known that if K is an absolutely integrable
function defined over $E_k$, with

(4.3) $\int_{E_k} K(t_1, \dots, t_k)\, dt_1 \cdots dt_k = 1,$

and if f is bounded and continuous, then for the function ($\lambda > 0$)


we have

(4.5) $\lim f_\lambda(x) = f(x)$ as $\lambda \to \infty$.


We shall require of K that it be non-negative, possess derivatives of all
orders, and that it vanish outside a fixed, finite hypercube in $E_k$. Then it is
permissible to differentiate under the integral in (4.4a), and we obtain


(4. 6) In Ls K(t)dt 


where we are using an obvious abbreviation in notation. Consequently, by
condition a), together with (4.3) and the fact that K is non-negative, we
conclude

$$|L_n(f_\lambda, x)| \le m_n.$$

Since K vanishes outside a finite region in $E_k$, and f vanishes except when
$x_i > 0$, we easily infer from (4.4) that for any fixed $\lambda > 0$, $f_\lambda$ vanishes except
when $x_i > b_i$, where $b_1, \dots, b_k$ are constants depending on $\lambda$. Finally, the
partial derivatives of $f_\lambda$ are bounded in $E_k$. To see this we make use of (4.4b):

r Ot," eee 


Since the derivatives of K vanish outside a finite region, and f is bounded, the 
result follows. 


⁸ S. Bochner, Vorlesungen über Fouriersche Integrale, Leipzig (1932), p. 191.




But now we can conclude from the first part of the proof, after a trivial
change of coordinates, that $f_\lambda \equiv 0$. By (4.5) it follows that $f \equiv 0$. The proof
is thus complete.

For the next theorem we introduce the Laplace operator and its iterates.

$$\Delta f(x) = \frac{\partial^2 f}{\partial x_1^2} + \cdots + \frac{\partial^2 f}{\partial x_k^2}, \qquad \Delta^n f(x) = \Delta(\Delta^{n-1} f(x)).$$

As usual we adopt the convention $\Delta^0 f = f$.


THEOREM 9. Let f be defined in $E_k$, and let U be a set of determination
in $E_k$. Then the conditions

a) $|\Delta^n f(x)| \le m_n$, x in $E_k$
b) $\Delta^n f(\xi) = 0$, $\xi$ in U

imply that $f \equiv 0$.

Proof. We again make the temporary assumption that the partial derivatives
of f are bounded in $E_k$. For $\alpha \ge 0$ we define


(4.%a) 

Ex 

(4.%b) , x, &) 

Ex 
where = is a normalizing factor. %) =F (a2, a) may 
be regarded as a transform of f(x). If we write 
F(a, a) Ta(f; 


then because of the formula 


ow f e (t+... +t?) dt; = 1 
Ex 


it is clear that $T_0(f, x) = f(x)$ and

(4.8) $|T_\alpha(f, x)| \le \max |f|.$




It is well known that⁹

(4.9) $\lim F(x, \alpha) = f(x)$ as $\alpha \to 0$,

the convergence being uniform in any finite portion of $E_k$.

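Under our temporary assumption we may calculate the derivative of
$F(x, \alpha)$ with respect to $\alpha$ by differentiating under the integral in (4.7a). If
we do this, and then integrate the resulting equation by parts we obtain

$$\frac{\partial F(x, \alpha)}{\partial \alpha} = T_\alpha(\Delta f, x).$$

Consequently

(4.10) $\frac{\partial^n F(x, \alpha)}{\partial \alpha^n} = T_\alpha(\Delta^n f, x).$

Therefore, by (4.8) and condition a)

$$\Big| \frac{\partial^n F(x, \alpha)}{\partial \alpha^n} \Big| = |T_\alpha(\Delta^n f, x)| \le m_n, \qquad x \text{ in } E_k,\ \alpha \ge 0,$$

while by condition b)

$$\Big[ \frac{\partial^n F(\xi, \alpha)}{\partial \alpha^n} \Big]_{\alpha = 0} = \Delta^n f(\xi) = 0, \qquad \xi \text{ in } U.$$

By Theorem D it follows that $F(\xi, \alpha) = 0$ if $\alpha \ge 0$ and $\xi$ is in U. But for
fixed $\alpha > 0$, $F(x, \alpha)$ is analytic in the entire space $E_k$. This is proved¹⁰ by
making use of (4.7b). Consequently, because of the nature of the set U,
$F(x, \alpha) \equiv 0$ for each x in $E_k$ if $\alpha > 0$. By (4.9) we conclude $f \equiv 0$.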
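The identity that drives this proof, that differentiating the transform in $\alpha$ is the same as transforming $\Delta f$, can be illustrated in one dimension on polynomials. The sketch assumes the normalization $T_\alpha(f, x) = \pi^{-1/2} \int f(x + 2\sqrt{\alpha}\, t)\, e^{-t^2}\, dt$, that is, the mean of $f(x + \sqrt{2\alpha}\, Z)$ for a standard normal Z. That choice is an assumption made for the example, and the helper names are hypothetical.

```python
import math

def double_factorial(i):
    """i!! for i >= -1, with (-1)!! = 0!! = 1."""
    out = 1
    while i > 1:
        out *= i
        i -= 2
    return out

def T(coeffs, x, alpha):
    """Transform of the polynomial sum(c_j * x**j): the mean of p(x + sqrt(2*alpha) * Z),
    Z standard normal, evaluated exactly from the Gaussian moments E[Z^i] = (i-1)!!."""
    total = 0.0
    for j, c in enumerate(coeffs):
        for i in range(0, j + 1, 2):                     # odd moments vanish
            moment = (2.0 * alpha) ** (i // 2) * double_factorial(i - 1)
            total += c * math.comb(j, i) * x ** (j - i) * moment
    return total

def second_derivative(coeffs):
    """Coefficients of p'', the one-dimensional Laplacian of p."""
    return [j * (j - 1) * c for j, c in enumerate(coeffs)][2:]

if __name__ == "__main__":
    p = [0, 0, 0, 0, 1]                                  # p(x) = x^4
    x, alpha, h = 1.5, 0.3, 1e-4
    slope = (T(p, x, alpha + h) - T(p, x, alpha - h)) / (2 * h)
    print(slope, T(second_derivative(p), x, alpha))      # the two values should agree
```

Under this normalization the transform at $\alpha = 0$ returns the polynomial itself, which matches the role of $T_0(f, x) = f(x)$ in the text.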

Without the assumption of boundedness of the partial derivatives of f the
whole difficulty lies in the establishment of (4.10). In order to meet this
difficulty we define $f_\lambda(x)$ by (4.4), where K has the same properties as were
required before. Then we know that the partial derivatives of $f_\lambda$ ($\lambda > 0$) with
respect to the x's are bounded in $E_k$, and that

(4.11) $|\Delta^n f_\lambda(x)| \le m_n$, x in $E_k$, $\lambda > 0$.

From (4.6) it follows that

(4.12) $\lim \Delta^n f_\lambda(x) = \Delta^n f(x)$ as $\lambda \to \infty$


⁹ This is in fact a special case of the theorem referred to in footnote 8. For the
case k = 1 see E. Borel, Leçons sur les Fonctions de Variables Réelles, Paris (1928),
pp. 52-53.

¹⁰ The method used by Borel, op. cit., pp. 53-54 for the case k = 1 works equally
well in general. It consists in defining the integral for complex values of the variables
and showing that the resulting function is analytic.




and it is easily verified that the convergence is uniform over each finite portion
of $E_k$. Now by the previous work, if

$$F_\lambda(x, \alpha) = T_\alpha(f_\lambda, x)$$

then

(4.13) $\frac{\partial^n F_\lambda(x, \alpha)}{\partial \alpha^n} = T_\alpha(\Delta^n f_\lambda, x).$

It is readily proved, using condition a) and (4.11), (4.12), that

$$\lim T_\alpha(\Delta^n f_\lambda, x) = T_\alpha(\Delta^n f, x) \quad \text{as } \lambda \to \infty.$$

This follows from (4.7b), together with the fact that f, $f_\lambda$ are bounded, and
that $f_\lambda \to f$ uniformly over each finite portion of $E_k$.

The passage to the limit $\lambda \to \infty$ in (4.13) then establishes (4.10) and
completes the proof.


5. In this concluding section we are concerned with the sphere

$$V_k: \quad x_1^2 + \cdots + x_{k+1}^2 = 1 \qquad (k \ge 2)$$

in $E_{k+1}$. It is an analytic Riemannian space of constant positive curvature,
and as such the Laplace operator $\Delta$ is defined for it.¹¹

We may define a set of determination U on $V_k$ in the same way as it was
done in § 4. A function $\varphi$ is analytic over a portion R of $V_k$ if it is analytic
in each of the allowable coordinate systems covering R (any two coordinate
systems being connected by analytic transformations).

Points of $V_k$ will be denoted by P, Q, ....


THEOREM 10. Let f be defined in $V_k$ ($k \ge 2$), and let U be a set of
determination in $V_k$. Then the conditions

a) $|\Delta^n f(P)| \le m_n$, P in $V_k$
b) $\Delta^n f(P_0) = 0$, $P_0$ in U

imply that $f \equiv 0$.


The proof depends upon the properties of certain expansions in series of
ultra-spherical functions. These expansions reduce to the ordinary Laplace
series of spherical harmonics when k = 2.


¹¹ The invariant form of the operator is $\Delta f = g^{ij} f_{,ij}$.
See O. Veblen, Invariants of Quadratic Differential Forms, Cambridge University Press
(1927), p. 63.




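We shall introduce coordinates in $V_k$ by using spherical coordinates in
$E_{k+1}$. We put

(5.1)
$x_1 = r \cos \theta_1$
$x_2 = r \sin \theta_1 \cos \theta_2$
$\cdots$
$x_k = r \sin \theta_1 \cdots \sin \theta_{k-1} \cos \theta_k$
$x_{k+1} = r \sin \theta_1 \cdots \sin \theta_{k-1} \sin \theta_k.$

Then it may be found that if¹²
¹² See E. Heine, Handbuch der Kugelfunktionen, Band I, 2nd ed., Berlin (1878),
pp. 460-461.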
v = sink! 6, sin*-? sin 


2 


Uy, 
= sin? 6, 
Uy = 7? sin? 6,- sin? 
and F is defined in Fy,,, then 
ér\1 Or 06, 06; 06; \ ux 


Hence we have, for a function f defined in $V_k$

(5.2) $\Delta f = \sum_{i=1}^{k} \frac{w_i}{\sin^{k-i} \theta_i}\, \frac{\partial}{\partial \theta_i} \Big( \sin^{k-i} \theta_i\, \frac{\partial f}{\partial \theta_i} \Big)$

where $w_1 = 1$, $w_i = (\sin \theta_1 \cdots \sin \theta_{i-1})^{-2}$, $i = 2, \dots, k$.

Now we put

(5.3) $\nu = (k - 1)/2$

and consider the polynomials $P_n^{(\nu)}(x)$ defined by the expansion¹³

(5.4) $(1 - 2rx + r^2)^{-\nu} = \sum_{n=0}^{\infty} r^n P_n^{(\nu)}(x).$

For k = 2, $\nu = 1/2$ and $P_n^{(\nu)}(x)$ is the Legendre polynomial of order n. It
is known that if P, Q are two points on the sphere $V_k$ and $\gamma$ is the angle
($0 \le \gamma \le \pi$) between the lines joining these points to the center of the sphere
(so that $\gamma$ is the geodesic distance from P to Q), then for the function

(5.5) $F_n(P, Q) = P_n^{(\nu)}(\cos \gamma)$


¹³ A systematic study of the functions $P_n^{(\nu)}(x)$ may be found in N. Nielsen, Théorie
des Fonctions Métasphériques, Paris (1911), especially Chapter VII. The expansion
(5.4) is given on p. 98. The notation of Heine, op. cit., pp. 451-460, is slightly different
from that of Nielsen.




we have¹⁴

(5.6) $\Delta F_n = -n(n + 2\nu) F_n.$

Since $F_n$ is symmetric in P and Q it does not matter which point is regarded as
variable in evaluating the left member of (5.6).
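The polynomials generated by $(1 - 2rx + r^2)^{-\nu}$ are the ultraspherical (Gegenbauer) polynomials, and they can be computed without expanding the generating function. The sketch below uses the standard three-term recurrence $n P_n = 2x(n + \nu - 1) P_{n-1} - (n + 2\nu - 2) P_{n-2}$, which is a known property of these polynomials rather than something stated in the paper. It then compares a truncated series with the generating function. The function name is hypothetical.

```python
import math

def gegenbauer(n, nu, x):
    """P_n^(nu)(x) from the recurrence m*P_m = 2x(m+nu-1)*P_{m-1} - (m+2nu-2)*P_{m-2},
    started from P_0 = 1 and P_1 = 2*nu*x."""
    p_prev, p = 1.0, 2.0 * nu * x
    if n == 0:
        return p_prev
    for m in range(2, n + 1):
        p_prev, p = p, (2.0 * x * (m + nu - 1.0) * p - (m + 2.0 * nu - 2.0) * p_prev) / m
    return p

if __name__ == "__main__":
    nu, x, r = 1.5, 0.4, 0.3
    series = sum(r ** n * gegenbauer(n, nu, x) for n in range(60))
    print(series, (1.0 - 2.0 * r * x + r * r) ** (-nu))   # the two values should agree closely
    print(gegenbauer(2, 0.5, 0.5))                        # Legendre case, (3x^2 - 1)/2 at x = 0.5
```

For $\nu = 1/2$ the recurrence reduces to the usual one for Legendre polynomials, in line with the remark on the case k = 2.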

As a consequence of (5.6) and Green's theorem we also have the important
relation

(5.7) $\int_{V_k} F_n(P, Q)\, \Delta f(Q)\, dQ = -n(n + 2\nu) \int_{V_k} F_n(P, Q)\, f(Q)\, dQ$

where dQ denotes the volume element in the space $V_k$.
With a function f defined in $V_k$ we associate the series¹⁵


(5.8) ~ f) 
where 
(5.9) an(P, f) — (P, 


Concerning this series we assert:
LEMMA 4. If f is continuous on $V_k$ and $H_n^{(p)}(P)$ denotes the Cesàro
means of order p of the series (5.8), then

$$\lim H_n^{(p)}(P) = f(P) \quad \text{if } p > \nu,$$

$$|H_n^{(p)}(P)| \le \max |f| \quad \text{if } p \ge 2\nu + 1.$$

Proof. In proving the first assertion we may assume that the coordinates
(5.1) are chosen so that P is the point $x_1 = 1$, $x_i = 0$, $i > 1$. Then $\gamma = \theta_1$,
and


Vi 0 0 


‘sin and Q has the codrdinates (6,,- - ). From 


this result and the formula *° 


where w = sin*-? 6, - - 


$$P_n^{(\nu)}(1) = \frac{\Gamma(n + 2\nu)}{\Gamma(n + 1)\, \Gamma(2\nu)}$$


¹⁴ Heine, op. cit., p. 461.

¹⁵ This series arises naturally through the expansion of the integrand in Poisson's
integral when the latter is used to solve the Dirichlet problem for the sphere $V_k$.

¹⁶ Nielsen, op. cit., p. 95.




it is seen that if we put 


then the series (5.8) becomes the formal expansion of $\Phi(\cos\theta)$ in a series of
the form


(5.11)    $\Phi(\cos\theta) \sim \sum_{n=0}^{\infty} c_n P_n^{(\nu)}(\cos\theta)$

for the point $\theta = 0$, the coefficients $c_n$ being given by the formula ¹⁷


$\Phi(\cos\theta_1)$ is continuous, since $f$ is continuous; also $\Phi(1) = f(P)$, since $P$ has
coordinates $\theta_1 = 0$, $\theta_i$ arbitrary if $i \ge 2$. Now it has been proved by
Kogbetlianz ¹⁸ that the series (5.11) is uniformly summable $(C, p)$ to $\Phi$ if $p > \nu$
and the function is continuous. This proves the first assertion of the lemma.
In connection with the second part of the lemma we consider the series


(5. 13) (n + v) Pa (cos 6). 


If $s_n^{(p)}(\cos\theta)$ denotes the Cesàro means of order $p$ of this series we see from
(5.5) and (5.9) that


(5.14)    $H_n^{(p)}(P) = \int_{V_k} s_n^{(p)}(\cos\gamma)\, f(Q)\, dQ.$


In particular, for $f \equiv 1$ we find


(5.15)    $1 = \int_{V_k} s_n^{(p)}(\cos\gamma)\, dQ.$


Now Kogbetlianz has proved ¹⁹ that $s_n^{(p)}(\cos\gamma) \ge 0$ if $p \ge 2\nu + 1$ and
$0 \le \gamma \le \pi$. The second assertion of the lemma now follows at once from
(5.14) and (5.15).
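The step from (5.14) and (5.15) to the bound can be written out in one line. This is a sketch under two assumptions: the normalizing constant of (5.9) is absorbed into the kernel $s_n^{(p)}$, and $p \ge 2\nu + 1$, so that the kernel is non-negative.

```latex
\begin{aligned}
|H_n^{(p)}(P)|
  &= \Bigl|\int_{V_k} s_n^{(p)}(\cos\gamma)\, f(Q)\, dQ\Bigr|
   \le \int_{V_k} s_n^{(p)}(\cos\gamma)\, |f(Q)|\, dQ \\
  &\le \max|f| \int_{V_k} s_n^{(p)}(\cos\gamma)\, dQ
   = \max|f| .
\end{aligned}
```

The non-negativity of the kernel is what allows the absolute value to pass inside the integral without loss; for $\nu < p < 2\nu + 1$ the series is still summable but this bound is not asserted.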

We now turn to the proof of Theorem 10. We begin with the series


(5.16)    $F(P, \lambda) = \sum_{n=0}^{\infty} e^{-n(n+2\nu)\lambda}\, a_n(P, f), \qquad \lambda > 0$


17 Ervand Kogbetlianz, “ Sommabilité des Séries Ultrasphériques,” Journal de Mathé- 
matiques, vol. 3 (1924), p. 110. 

18 Kogbetlianz, loc. cit., p. 118, p. 168. 

19 Kogbetlianz, loc. cit., p. 179.




obtained from (5.8) by introduction of exponential factors. Concerning this 
function we make the following statements. 


Lemma 5. For each $P$ in $V_k$ the function $F(P, \lambda)$ defined by (5.16) is
continuous, with continuous derivatives of all orders on the open interval
$0 < \lambda < \infty$. Moreover


and 
(5. 18) | max | arf 0<AS1 


where $A$ is a constant independent of r, n, and P.

Granting for a moment the truth of the lemma we can easily deduce from
it a proof of Theorem 10. For from the lemma and the conditions of Theorem
10 we conclude by Theorem D that $F(P_0, \lambda) = 0$ if $P_0$ is in $U$. Therefore
$a_n(P_0, f) = 0$ if $P_0$ is in $U$. But $P_n^{(\nu)}(\cos\gamma)$ is a polynomial in $\cos\gamma$, and
$\cos\gamma$ is a polynomial ²⁰ in $\sin\theta_i$, $\cos\theta_i$, $\sin\beta_i$, $\cos\beta_i$, $i = 1, \dots, k$, where $\theta_i$
refers to $Q$ and $\beta_i$ to $P$ in (5.5). Thus $a_n(P, f)$, as a function of $P$, is analytic
on $V_k$, and since it vanishes on $U$ it vanishes identically. Then by Lemma 4,
$f(P) = 0$.

It remains only to prove Lemma 5. The series (5.16) is a power series
in $t = e^{-\lambda}$. Since (5.8) is summable $(C, p)$ if $p > \nu$ we know that ²¹

$\lim_{n \to \infty} \dfrac{a_n(P, f)}{n^p} = 0, \qquad p > \nu$

and from this it follows that (5.16) is convergent for all positive values of $\lambda$
(i.e. for $0 \le t < 1$). We may obviously differentiate term by term with
respect to $\lambda$. Doing this we obtain


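$\dfrac{\partial F(P, \lambda)}{\partial \lambda} = -\sum_{n=0}^{\infty} n(n + 2\nu)\, e^{-n(n+2\nu)\lambda}\, a_n(P, f) = \sum_{n=0}^{\infty} e^{-n(n+2\nu)\lambda}\, a_n(P, \Delta f)$

because of (5.7). Thus in general


(5.19)    $\dfrac{\partial^m F(P, \lambda)}{\partial \lambda^m} = \sum_{n=0}^{\infty} e^{-n(n+2\nu)\lambda}\, a_n(P, \Delta^m f), \qquad \lambda > 0.$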


20 For the expression of $\cos\gamma$, see Heine, op. cit., p. 458, and the second line on
p. 461.

21 See for example Tomlinson Fort, Infinite Series, Oxford (1930), p. 210, Theorem
210.




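From this it follows that in the proof of Lemma 5 we may restrict our
attention to the case n = 0.
If we reorder the series (5.16) according to the rule


$\sum_{n=0}^{\infty} a_n b_n = \sum_{n=0}^{\infty} (a_0 + \cdots + a_n)(b_n - b_{n+1})$


applied $p + 1$ times, where we put $b_n = e^{-n(n+2\nu)\lambda}$, we obtain


(5.20)    $F(P, \lambda) = (-1)^{p+1} \sum_{n=0}^{\infty} \binom{n+p}{p} H_n^{(p)}(P)\, D_n^{p+1} b_n$


where $D_n^0 b_n = b_n$, $D_n^1 b_n = b_{n+1} - b_n$, and $D_n^{j+1} b_n = D_n^1 (D_n^j b_n)$. The
justification of this rearrangement is easily made because of the speed with which
$b_n$ and $D_n^j b_n \to 0$ as $n \to \infty$.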
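The repeated summation by parts can be tried numerically. The sketch below is illustrative only. It assumes the convention $D b_n = b_{n+1} - b_n$ and uses the fact that the $(p+1)$-fold cumulative sums of a sequence equal $\binom{n+p}{p}$ times its Cesàro means of order $p$. The sequence `a` in the usage note is an arbitrary stand-in for the coefficients $a_n(P, f)$, and the function names are not from the paper.

```python
import math
from itertools import accumulate

def forward_differences(b, order):
    """Apply D b_n = b_{n+1} - b_n `order` times (the list shortens each time)."""
    for _ in range(order):
        b = [b[i + 1] - b[i] for i in range(len(b) - 1)]
    return b

def cesaro_sums(a, p):
    """(p+1)-fold cumulative sums, i.e. binom(n+p, p) times the (C, p) means."""
    for _ in range(p + 1):
        a = list(accumulate(a))
    return a

def cesaro_means(a, p):
    """Cesaro means of order p of the series with terms a_n."""
    return [s / math.comb(n + p, p) for n, s in enumerate(cesaro_sums(a, p))]

def rearranged_sum(a, b, p):
    """Right member of the rearrangement: (-1)^(p+1) sum S_n^(p) D^(p+1) b_n."""
    sums = cesaro_sums(a, p)
    diffs = forward_differences(b, p + 1)
    sign = (-1) ** (p + 1)
    return sign * sum(sums[n] * diffs[n] for n in range(len(diffs)))

def direct_sum(a, b):
    """Left member: sum a_n b_n over the available terms."""
    return sum(x * y for x, y in zip(a, b))
```

With, say, `b = [math.exp(-n * (n + 2) * 0.1) for n in range(80)]` and `a = [(-1) ** n * (n + 1) for n in range(80)]`, the two sums agree up to rounding, because the boundary terms dropped by truncation carry the factor $b_n$ for large $n$.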

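Now we put $p = 2\nu + 1 = k$ and $M = \max |f|$. Then by Lemma 4


$|F(P, \lambda)| \le M \sum_{n=0}^{\infty} \binom{n+k}{k} |D_n^{k+1} b_n|$

and it suffices, to establish (5.18), to show that for some constant $A$ we have


(5.21)    $K(\lambda) = \sum_{n=0}^{\infty} \binom{n+k}{k} |D_n^{k+1} b_n| \le A, \qquad 0 < \lambda \le 1.$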
0 


We write 
Dn 
Then 


1 1 
0 0 
Since 


(" if nZ1 


we see that (5.21) will hold if the series 


is bounded for $0 < \lambda \le 1$. But because of the equation
fo 1 n+1 oo 


the above series is equal to 


1 


a oO 
dt. | t,* | p**1(t,) | dt, tk | | dt. 
0 0 x 


It is therefore sufficient to show that this integral is bounded for $0 < \lambda \le 1$.
We set and write 


= 9(t) =e 
Then 
(t) — /2y (ket) (s) 


and therefore, with s; = A/*(1-+ v), 


This integral is less than an integral of the form


$\int Q(\lambda, s)\, e^{-s^2}\, ds$


where $Q(\lambda, s)$ is a polynomial in $s$ and $\lambda^{1/2}$. This is clearly bounded for
$0 \le \lambda \le 1$. Thus (5.21) and (5.18) are established.

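We have finally to prove (5.17). Since if $f \equiv 1$ we have $H_n^{(k)}(P) = 1$
and $F(P, \lambda) = 1$, we have from (5.20)


$1 = (-1)^{k+1} \sum_{n=0}^{\infty} \binom{n+k}{k} D_n^{k+1} b_n.$


Therefore in general

$F(P, \lambda) - f(P) = (-1)^{k+1} \sum_{n=0}^{\infty} \binom{n+k}{k} [H_n^{(k)}(P) - f(P)]\, D_n^{k+1} b_n.$

Now for a fixed $P$ in $V_k$ and a given $\epsilon > 0$ we can choose $N = N(\epsilon)$ so that

$|H_n^{(k)}(P) - f(P)| < \epsilon$ if $n \ge N$.


Since $b_n = e^{-n(n+2\nu)\lambda}$ it is clear that $D_n^{k+1} b_n \to 0$ as $\lambda \to 0$ for a fixed $n$.
Therefore, from (5.21) it follows that


$\limsup |F(P, \lambda) - f(P)| \le \epsilon A$ as $\lambda \to 0$.


This proves the validity of (5.17), since $\epsilon$ was arbitrary. Thus Theorem 10
is completely proved.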


PRINCETON UNIVERSITY. 




PROBLEMS OF THE CALCULUS OF VARIATIONS WITH 
PRESCRIBED TRANSVERSALITY CONDITIONS.* 


By JULIAN D. MANCILL.


Problems of the calculus of variations in $(x, y_1, y_2, \dots, y_n)$-space for
which a prescribed relation exists between the directions of the extremals and
the transversal directions were first studied by Rawles.¹ More recently, La
Paz,² using a method and point of view quite different from that of Rawles,
has given a rather complete treatment of the problem in non-parametric form.

In the present paper we use a method similar to that of La Paz but 
avoid his very intricate treatment of an associated system of non-homogeneous 
partial differential equations by reducing the problem to a very simple total 
differential equation. The method applies with equal facility to parametric 
and non-parametric problems of the calculus of variations in space of any 
number of dimensions. We shall consider the problem in parametric form 
and derive necessary and sufficient conditions in order that a transversality 
relation belong to a problem of the calculus of variations. Finally, we shall 
obtain the most general integrand function of a problem of the calculus of 
variations to which a given transversality relation belongs. 


We shall consider the integral


(1)    $I = \int_{t_1}^{t_2} F(x, x')\, dt,$

where the integrand function $F(x_1, \dots, x_n, x'_1, \dots, x'_n) = F(x, x')$ satisfies
the usual continuity and homogeneity properties in a fundamental region $R'$.³
For a problem of minimizing the integral (1), an extremal $E$ through the
point $(x)$ in the direction $(x')$ is cut transversally by the hyper-plane of
directions $(\bar{x}')$ defined through this point by the equation


* Received May 11, 1938. 

1 Transactions of the American Mathematical Society, vol. 30 (1928), pp. 765-784. 
For the earlier writers on this problem in the plane case, see Stromquest, Transactions
of the American Mathematical Society, vol. 7 (1906), p. 181; and Bliss, Annals of
Mathematics (2), vol. 9 (1907), p. 134.

2 Bulletin of the American Mathematical Society, vol. 36 (1930), pp. 674-680.

3 In this connection, see Bolza, Lectures on the Calculus of Variations, University
of Chicago, pp. 17-21. In this paper we shall assume that the function $F$ is of class $C'''$
in a region $R$ and all $(x') \ne (0)$. For a definition of the term class as here used see
Bolza, loc. cit., p. 116.
(2)    $\bar{x}'_\mu F_{x'_\mu}(x, x') = 0.$


Here, as elsewhere in this paper, a repeated Greek letter is an umbral index
indicating a summation with the range 1 to $n$ unless otherwise specified.
This definition of transversality simply defines what is meant by transversality
for every element $(x, x')$ in the fundamental region $R'$ of the integral (1).
We shall restrict our attention to that portion of the region $R'$ in which
the function $F(x, x')$ is different from zero. In this subregion of $R'$ the
functions
(3)    $t_i = F_{x'_i}(x, x')$    $(i = 1, 2, \dots, n)$


are of class $C''$ and not all zero for any element $(x, x')$, as follows from the
homogeneity property


(4)    $x'_\mu F_{x'_\mu}(x, x') = F(x, x').$
Hence, in the subregion specified, we have
(5)    $x'_\mu F_{x'_\mu}(x, x') \ne 0.$
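The homogeneity property (4) is Euler's relation for a function positively homogeneous of degree one in $(x')$. The following sketch checks it by central differences. The integrand `F` below, with the factor `1 + x.x`, is a hypothetical example chosen for illustration and is not taken from the paper.

```python
import math

def F(x, xp):
    """Hypothetical integrand G(x) * |x'| with G(x) = 1 + x.x.

    It is positively homogeneous of degree one in x'.
    """
    g = 1.0 + sum(c * c for c in x)
    return g * math.sqrt(sum(c * c for c in xp))

def grad_xp(func, x, xp, h=1e-6):
    """Central-difference approximation of the partial derivatives F_{x'_i}."""
    grads = []
    for i in range(len(xp)):
        up = list(xp)
        dn = list(xp)
        up[i] += h
        dn[i] -= h
        grads.append((func(x, up) - func(x, dn)) / (2 * h))
    return grads

def euler_sum(func, x, xp):
    """Left member of (4): the sum over mu of x'_mu * F_{x'_mu}(x, x')."""
    return sum(v * g for v, g in zip(xp, grad_xp(func, x, xp)))
```

For any admissible element, `euler_sum(F, x, xp)` should agree with `F(x, xp)` up to the discretization error of the difference quotient, and it is non-zero wherever `F` is, which is the content of (5).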


We shall say that the functions (3) define the transversality relation (2) 
and that this transversality belongs to the calculus of variations problem (1). 

The condition (2) establishes a certain relation between the element
$(x, x')$ through the point $(x)$ and the hyper-plane of directions $(\bar{x}')$ whose
normal has the direction $(t)$ through the point. If $n$ given functions $T_i(x, x')$,
$i = 1, 2, \dots, n$, of class $C''$ in a region $S'$ are to define a transversality


(6)    $\bar{x}'_\mu T_\mu(x, x') = 0$


belonging to a problem (1) with fundamental region $S'$ in which the integrand
function is not zero, then the equation (6) and the equation (2) define the
same hyper-plane of directions $(\bar{x}')$ for every element $(x, x')$ in $S'$. That is,
there must exist functions $F(x, x')$ and $K(x, x')$ different from zero in $S'$
and satisfying the system of equations


(7)    $F_{x'_i}(x, x') = K(x, x')\, T_i(x, x')$    $(i = 1, 2, \dots, n)$


and the homogeneity condition (4). It is easily verified that in order that
the homogeneity property (4) be satisfied by the function $F(x, x')$, we must
have

(8)    $K(x, x') = F(x, x') / x'_\mu T_\mu(x, x'),$


since $x'_\mu T_\mu \ne 0$ as follows from the relations (4) and (7). Thus it follows
from (7) that


(9)    $F_{x'_i}(x, x') = F(x, x')\, T_i(x, x') / x'_\mu T_\mu(x, x')$    $(i = 1, 2, \dots, n).$




332 JULIAN D. MANCILL. 


If these partial derivatives be substituted in

$dF = F_{x'_\mu}(x, x')\, dx'_\mu,$


where $F$ is regarded as a function of the $n$ variables $(x')$ and the $n$ parameters
$(x)$, we obtain the total differential equation


(10)    $dF/F = T_\mu\, dx'_\mu / (x'_\nu T_\nu).$


The left member of this equation is the exact differential of the function $\log F$.
It is found that necessary conditions for the right member to be the differential
of a function $H(x, x')$ considered as a function of the variables $(x')$, are that
the functions $T_i$ satisfy the $n(n-1)/2$ relations

(11)    $R_{mk} \equiv \dfrac{\partial}{\partial x'_k}\Bigl(\dfrac{T_m}{x'_\mu T_\mu}\Bigr) - \dfrac{\partial}{\partial x'_m}\Bigl(\dfrac{T_k}{x'_\mu T_\mu}\Bigr) = 0,$

where $m, k = 1, 2, \dots, n$, $m < k$. These conditions are also sufficient if
suitable connectivity properties are assumed for the region $S'$. An easy
calculation shows that the expressions $R_{mk}$ can be expressed as follows:


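$T_m x'_\mu\, \partial T_k/\partial x'_\mu - T_k x'_\mu\, \partial T_m/\partial x'_\mu$
$+\ x'_\nu [T_m(\partial T_\nu/\partial x'_k - \partial T_k/\partial x'_\nu) + T_k(\partial T_m/\partial x'_\nu - \partial T_\nu/\partial x'_m)$
$+\ T_\nu(\partial T_k/\partial x'_m - \partial T_m/\partial x'_k)] = 0,$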


where $\nu$ is an umbral index with range 1 to $n$ excluding $m$ and $k$.
If the conditions (11) are satisfied in $S'$ then there exists a function


(12)    $F(x, x') = G(x)\, e^{H(x, x')}$


satisfying the equations (7) with $K(x, x')$ defined by equation (8). Then
the functions $T_i$ satisfy the $n(n-1)(n-2)/6$ equations


(13)    $T_r(\partial T_s/\partial x'_t - \partial T_t/\partial x'_s) + T_s(\partial T_t/\partial x'_r - \partial T_r/\partial x'_t) + T_t(\partial T_r/\partial x'_s - \partial T_s/\partial x'_r) = 0,$


where the suffixes $r$, $s$, and $t$ represent all possible combinations of three suffixes
from among $1, 2, \dots, n$.⁴ It can be shown that only $(n-1)(n-2)/2$
of these conditions are independent, one set of which may be obtained by
fixing one of the suffixes and letting the other two represent all possible
combinations of two suffixes from among the remaining $n-1$ numbers.
Therefore, if the equations (11) are satisfied so are the equations


(14)    $T_m x'_\mu\, \partial T_k/\partial x'_\mu - T_k x'_\mu\, \partial T_m/\partial x'_\mu = 0,$


4 Forsyth, Differential Equations, Macmillan and Co. (1888), pp. 261-262.




where $m, k = 1, 2, \dots, n$, $m < k$. Conversely, if the equations (13) and (14)
are satisfied so are equations (11). Consequently, the two sets of conditions
(13) and (14) are equivalent to the set (11).

Now, the set of equations (14) will be satisfied if and only if the $n$
functions $T_i$ are such that


$x'_\mu\, \partial T_i/\partial x'_\mu = h(x, x')\, T_i$    $(i = 1, 2, \dots, n),$


that is, if and only if the functions $T_i$ considered as functions of the $n$
variables $(x')$ are solutions of the same partial differential equation


(15)    $x'_\mu\, \partial T/\partial x'_\mu = h(x, x')\, T.$


To obtain the most general integral of this linear equation we write the
subsidiary equations


$dx'_1/x'_1 = dx'_2/x'_2 = \cdots = dx'_n/x'_n = dT/h(x, x')T$
and obtain $n$ independent integrals of these, namely
$x'_1/x'_n = a_1, \ \dots, \ x'_{n-1}/x'_n = a_{n-1}, \quad T/U(x, x') = a_n,$


where the $a_i$ are constant with respect to the variables $(x')$. The last integral
is obtained by making the substitution $x'_i = a_i x'_n$, $i = 1, 2, \dots, n-1$, in
$h(x, x')$ and integrating the equation


$h\, dx'_n/x'_n = dT/T$


and then replacing the $a_i$ by their values $x'_i/x'_n$, $i = 1, 2, \dots, n-1$. Then
the most general solution of the equation (15) is of the form


$T = U(x, x')\, W(x, u_1, \dots, u_{n-1}), \quad u_k = x'_k/x'_n$
$(k = 1, 2, \dots, n-1),$


where $W$ is an arbitrary function of its arguments. Conversely, the function

$T = R(x, x')\, W(x, u_1, \dots, u_{n-1}), \quad u_k = x'_k/x'_n,$

where $R$ and $W$ are arbitrary functions of their arguments is a solution of the
equation (15) for $h = x'_\mu R_{x'_\mu}/R$. Therefore, the conditions (14) will be
satisfied if and only if the functions $T_i$ are of the form


(16)    $T_i = R(x, x')\, W_i(x, u_1, \dots, u_{n-1}), \quad u_k = x'_k/x'_n$
$(k = 1, 2, \dots, n-1),$


where $R$ and $W_i$ are arbitrary functions of their arguments.
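The converse statement, that a product $T = RW$ with $W$ depending on $(x')$ only through the ratios $u_k$ solves (15) with $h = x'_\mu R_{x'_\mu}/R$, can be checked numerically. The sketch below suppresses the parameters $(x)$ and uses hypothetical choices of `R` and `W` for $n = 3$; neither function comes from the paper.

```python
def radial_derivative(func, xp, h=1e-6):
    """Approximate the sum over mu of x'_mu * d(func)/d(x'_mu)."""
    total = 0.0
    for i, v in enumerate(xp):
        up = list(xp)
        dn = list(xp)
        up[i] += h
        dn[i] -= h
        total += v * (func(up) - func(dn)) / (2 * h)
    return total

def R(xp):
    """Hypothetical multiplier, not homogeneous in x'."""
    return 1.0 + xp[0] ** 2 + xp[1] * xp[2]

def W(xp):
    """Hypothetical function of the ratios u_k = x'_k / x'_n only."""
    u = [v / xp[-1] for v in xp[:-1]]
    return u[0] + u[1] ** 2

def T(xp):
    """A function of the form (16)."""
    return R(xp) * W(xp)

def h_coefficient(xp):
    """The coefficient h = x'_mu R_{x'_mu} / R of equation (15)."""
    return radial_derivative(R, xp) / R(xp)
```

Because `W` is homogeneous of degree zero, its radial derivative vanishes, so the radial derivative of `T` is carried entirely by `R`; this is why the same `h` serves for every $T_i$ built on a common `R`.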




We may summarize our results in the following theorem: 


THEOREM. If $T_i(x, x')$, $i = 1, 2, \dots, n$, are of class $C''$ in a region $S'$
of $(x, x')$-values, then necessary and (if $S'$ possesses suitably simple
connectivity properties) sufficient conditions that these functions define a
transversality relation (2) for a problem (1) whose integrand $F$ is different from
zero are that these functions satisfy the inequality $x'_\mu T_\mu(x, x') \ne 0$, be of the
form (16), and satisfy $(n-1)(n-2)/2$ independent relations of the form
(13). Whenever these conditions are satisfied, the corresponding integrand
function is of class $C'''$ and of general form (12), in which $H(x, x')$ is defined
by the line integral

$H(x, x') = \int_{(x'_0)}^{(x')} T_\mu\, dx'_\mu / (x'_\nu T_\nu),$

where $(x'_0)$ is any fixed admissible element of $S'$, and $G(x)$ is different from
zero and of class $C'''$ but is otherwise an arbitrary function of its arguments.


The fact that the integrand function $F$ as defined in the Theorem satisfies
the relation

$F(x, kx') = kF(x, x')$    $(k > 0)$


is easily seen by making the substitution $x'_i = u_i x'_n$, $i = 1, 2, \dots, n-1$, in (10)
and making use of the property (16) of the functions $T_i$.

As an illustration of the Theorem let us obtain the most general problem
(1) for which transversality is equivalent to orthogonality. In this case
$T_i = x'_i$, $i = 1, 2, \dots, n$, which satisfy all the conditions of the Theorem for
all $(x') \ne (0)$. The function $H(x, x')$ takes the form


$H(x, x') = (1/2) \log x'_\mu x'_\mu$


if the lower limit in the integral is properly chosen. Therefore, the integrand
function $F(x, x')$ is of the form


$F(x, x') = G(x)\, (x'_\mu x'_\mu)^{1/2}.$
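The orthogonality example can be reproduced numerically by integrating the right member of (10) along a path in $(x')$-space. The sketch below is illustrative only: it uses Simpson's rule on a straight segment, and the function name and the sample end points are assumptions made for the example.

```python
import math

def line_integral_H(T, start, end, steps=2000):
    """Simpson approximation of the integral of T_mu dx'_mu / (x'_nu T_nu).

    The path is the straight segment from `start` to `end`; `steps` must be
    even and the segment must avoid points where x'_nu T_nu vanishes.
    """
    delta = [e - s for s, e in zip(start, end)]

    def integrand(t):
        xp = [s + t * d for s, d in zip(start, delta)]
        tv = T(xp)
        numerator = sum(a * b for a, b in zip(tv, delta))
        denominator = sum(a * b for a, b in zip(xp, tv))
        return numerator / denominator

    h = 1.0 / steps
    total = integrand(0.0) + integrand(1.0)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(j * h)
    return total * h / 3

def orthogonal_T(xp):
    """The case T_i = x'_i, for which transversality is orthogonality."""
    return list(xp)
```

Starting from a unit vector, the computed value should match $(1/2)\log x'_\mu x'_\mu$ at the end point, and splitting the path at an intermediate point should not change the result, in keeping with the exactness of the differential.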


UNIVERSITY OF ALABAMA. 


5 Mancill, Bulletin of the American Mathematical Society, vol. 43 (1937), p. 30,
Abstract 41.




NORMAL MATRICES OVER AN ARBITRARY FIELD OF 
CHARACTERISTIC ZERO.* 


By JOHN WILLIAMSON.


Introduction. A matrix $A$ with elements in the complex number field is
said to be a normal matrix, if $AA^* = A^*A$, where $A^*$ is the conjugate
transposed of the matrix $A$. A necessary and sufficient condition that $A$ be a
normal matrix is that there exist a unitary matrix $U$, such that $U^*AU = D$,
where $D$ is a diagonal matrix. Further a matrix $A$ is normal, if, and only
if, $A^* = f(A)$, where $f(x)$ is a polynomial in $x$. From this it is possible
to make a satisfactory definition of normality with respect to any non-singular
hermitian matrix $H$. A matrix $A$ is said to be normal with respect to the
hermitian matrix $H$, if $AH = Hf(A^*)$. The simplicity of the canonical form
of a normal matrix $A$ under unitary transformation suggests the following
problems. What are the possible canonical forms for a matrix $A$, normal with
respect to $H$, under similarity transformations by matrices which are
conjunctive automorphs of $H$? What are necessary and sufficient conditions that
two matrices, both normal with respect to $H$, be similar under transformations
by matrices, which are conjunctive automorphs of $H$? These problems were
discussed in a previous paper.¹ It is our intention here to consider the
corresponding problems where the matrices under consideration are matrices over
a field of characteristic zero.


1. Definitions. Let $K$ be a field of characteristic zero, over which is
defined an automorphism of period one or two. If under this automorphism
an element $a$ of $K$ corresponds to an element $\bar{a}$ of $K$, in both cases $\bar{\bar{a}} = a$. If
$A = (a_{ij})$, $i = 1, 2, \dots, m$; $j = 1, 2, \dots, n$, is a matrix with elements in $K$,
the matrix $A^*$, the conjugate transposed of $A$, is the matrix whose element in
the $i$-th row and $j$-th column is $\bar{a}_{ji}$. If the automorphism over $K$ is of period
one, $A^*$ is simply $A'$ the transposed of $A$. The matrix $A$ is said to be hermitian
if $A^* = A$ and anti-hermitian if $A^* = -A$. If the automorphism over $K$ is
of period one, an hermitian matrix is symmetric and an anti-hermitian one,
skew symmetric. In the sequel all matrices are matrices over the field $K$.


* Received July 30, 1938. 
1 John Williamson, “Matrices normal with respect to an hermitian matrix,”
American Journal of Mathematics, vol. 60 (April, 1938), pp. 355-374.




DEFINITION. If $H$ is a non-singular hermitian or anti-hermitian matrix,
so that $H^* = \epsilon H$, $\epsilon = \pm 1$, and $AH = Hf(A^*)$, where $f(x)$ is a polynomial
in the ring $K[x]$, the matrix $A$ is normal with respect to $H$.


Let $A$ be normal with respect to $H$, so that


(1)    $AH = Hf(A^*).$
Then, since $H^* = \epsilon H$,


$HA^* = f(A)H = Hf\{f(A^*)\}$ by (1),

and, since $H$ is non-singular,
(2)    $A^* = f\{f(A^*)\}$ and $A = f\{f(A)\}.$
On writing $B = f(A^*)$, we have as a consequence of (1) and (2)
(3)    $AH = HB$, $B^*H = HA^*$, $B^* = f(A)$, $A^* = f(B)$, $A = f(B^*)$.

It is easily shown that, if

$PAP^{-1} = A_1$ and $PHP^* = H_1$, $A_1H_1 = H_1f(A_1^*)$,

so that $A_1$ is normal with respect to $H_1$.


DEFINITION. Two matrices $A_1$ and $A_2$ are H-equivalent, if there exists a
non-singular matrix $P$ such that $PHP^* = H$ and $PA_1P^{-1} = A_2$.
Consequently we have


Lemma 1. If $A_1$ and $A_2$ are H-equivalent and $A_1$ is normal with
respect to $H$, then $A_2$ is normal with respect to $H$ and, if $A_1H = Hf(A_1^*)$,
$A_2H = Hf(A_2^*)$.
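A small instance over the rationals illustrates the definition and Lemma 1. The sketch below assumes the automorphism of period one, so that $A^*$ is the transpose. The matrices in the usage note and the polynomial $f(x) = 3 - x$ are hypothetical choices made for the example, and the helper names are not from the paper.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose; with the identity automorphism this is A*."""
    return [list(row) for row in zip(*a)]

def poly_at(coeffs, m):
    """Evaluate c0*I + c1*m + c2*m^2 + ... by Horner's rule."""
    n = len(m)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, m)
        for i in range(n):
            result[i][i] += c
    return result

def is_normal_wrt(a, h, coeffs):
    """Test the defining relation A H = H f(A*)."""
    return matmul(a, h) == matmul(h, poly_at(coeffs, transpose(a)))

def conjugate(p, p_inv, a):
    """Return P A P^{-1}; the caller supplies the inverse explicitly."""
    return matmul(matmul(p, a), p_inv)
```

Take `H = [[0, 1], [1, 0]]`, `A1 = [[1, 0], [0, 2]]` and `f = [3, -1]`. The permutation matrix `P = [[0, 1], [1, 0]]` satisfies $PHP^* = H$ and is its own inverse, so `conjugate(P, P, A1)` is H-equivalent to `A1` and should again pass `is_normal_wrt` with the same `f`. Relation (2) can be checked with `poly_at(f, poly_at(f, A1)) == A1`.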


2. Statement of problem. Let $A_1$ and $A_2$ be two similar matrices.
Then $A_1$ and $A_2$ are both similar to the same matrix $Q$ and there exist two
non-singular matrices $P_1$ and $P_2$, such that


(4)    $A_i = P_i^{-1} Q P_i$    $(i = 1, 2).$


If $A_1$ and $A_2$ are both normal with respect to $H$, by Lemma 1, $Q$ is normal
with respect to both of the matrices


(5)    $S_i = P_i H P_i^*$    $(i = 1, 2).$


The matrices $S_i$ in (5) are both hermitian, when $H$ is hermitian, and
anti-hermitian, when $H$ is anti-hermitian. The proof of the following fundamental
theorem is omitted.²


THEOREM 1. A necessary and sufficient condition that two matrices, both
normal with respect to $H$, be H-equivalent is that there exist a non-singular
matrix $C$ such that


$CQ = QC$ and $CS_1C^* = S_2$,
where $Q$, $S_1$ and $S_2$ are any matrices, which satisfy (5) and (6).


The matrix $Q$ is any matrix similar to $A_1$ and $A_2$ and is therefore at our
disposal. In order to determine when $A_1$ and $A_2$ are H-equivalent we need only
determine when $S_1$ is conjunctively equivalent to $S_2$ under transformations by
matrices commutative with $Q$. As is usual in such cases, we may treat $A_1$ and
$A_2$ separately and reduce $S_1$ (and $S_2$) to a canonical form under conjunctive
transformations by matrices commutative with $Q$. Accordingly we drop the
suffix one and write $A$, $S$ for $A_1$, $S_1$ respectively. The matrices $Q$, $S$ thus
obtained will be called a canonical form for $A$, $H$.


3. Reduction to special cases. The matrix $Q$ is now any matrix similar
to $A$ and therefore the invariant factors of $Q - xE$ are the same as those of
$A - xE$. The powers of the distinct irreducible polynomials, which are
divisors of the invariant factors of $A - xE$, will be called the elementary
factors of $A - xE$ or for brevity the elementary factors of $A$. Therefore the
elementary factors of $Q$ are the same as the elementary factors of $A$. Further,
if $[p(x)]^e$ is an elementary factor of $A$, we shall call $p(x)$ a characteristic
factor of $A$.

We first choose $Q$ to be the diagonal block matrix,

$Q = [Q_1, Q_2, \dots, Q_k],$

where each characteristic factor of $Q_i$ has the same value $p_i(x)$ and $p_i(x) \ne
p_j(x)$, when $i \ne j$. Then, if $PAP^{-1} = Q$ and $PHP^* = S$, by Lemma 1,


(6)    $QS = Sf(Q^*) = SM,$
where $M$ is the diagonal block matrix


$M = [M_1, M_2, \dots, M_k]$
and


$M_i = f(Q_i^*).$


2 See John Williamson, loc. cit., p. 260.




If the elementary factors of $Q_i$ are $[p_i(x)]^{e_{ij}}$, $j = 1, 2, \dots, t_i$, we may
suppose $Q_i$ to be in the Wedderburn canonical form,³ so that
(7)    $Q_i = [Q_{i1}, Q_{i2}, \dots, Q_{it_i}],$
where
(8)    $Q_{ij} = p_i E_{ij} + U_{ij}.$


In (8) $p_i$ is the companion matrix of $p_i(x)$, $p_i E_{ij}$ is the direct product of $p_i$
with the unit matrix of order $e_{ij}$ and $U_{ij}$ the direct product of the unit matrix
of the same order as $p_i$ with the auxiliary unit matrix of order $e_{ij}$. Then

$M_i = f(Q_i^*) = [f(Q_{ij}^*)] = [R_{ij} E_{ij} + V_{ij}]$,    $(j = 1, 2, \dots, t_i)$,
where $V_{ij}$ is nilpotent. If the characteristic equation of $R_{ij}$ is irreducible,
the number of elementary factors of $M_i$ is the same as that of $Q_i$; otherwise
the number of elementary factors of $M_i$ is more than that of $Q_i$. Since $M$ is
similar to $Q$, the number of elementary factors of $M$ is the same as that of $Q$
and therefore the characteristic equation of $R_{ij}$ is irreducible. Hence the
characteristic factors of $M_i$ all have the same value $p_j(x)$. Two distinct
cases arise:

Case (i). The characteristic factors of $M_i$ all have the same value
$p_j(x)$, $j \ne i$, and

Case (ii). The characteristic factors of $M_i$ all have the same value $p_i(x)$.
Let
$S = (S_{rs})$,    $(r, s = 1, 2, \dots, k)$,


be a partition of $S$ similar to that of $Q$; i.e. $S_{rs}$ is a matrix with the same
number of rows as $Q_r$ and the same number of columns as $Q_s$. Then, as a
consequence of (6),

(9)    $Q_r S_{rs} = S_{rs} M_s$    $(r, s = 1, 2, \dots, k).$


Case (i). Since $M_i$ has no characteristic factor in common with $Q_s$
when $s \ne j$, $S_{si} = 0$ when $s \ne j$. Further, since $S = \epsilon S^*$,
$S_{is} = 0,$
when $s \ne j$.
Finally, since $S$ is non-singular, $S_{ij}$ must be square and non-singular and
so must $S_{ji}$. Equation (9) with $r = i$ and $s = j$, shows that $M_j$ is similar to
$Q_i$ and therefore that all characteristic factors of $M_j$ have the same value $p_i(x)$.


3 J. M. Wedderburn, “Lectures on matrices,” Colloquium Publications (1934),
pp. 123-124.




Hence in case (i), $Q_i$ and $Q_j$ are of the same order and the exponents of the
elementary factors of $Q_i$ are the same as those of $Q_j$.


Case (ii). As a consequence of (9), $S_{is} = 0$, unless $s = i$. Since $S$ is
non-singular, $S_{ii}$ is non-singular.


Accordingly after a rearrangement of the rows of $Q$, $M$, and $S$ and the
same rearrangement of the columns, we see that $Q$, $M$ and $S$ are similarly
partitioned diagonal block matrices. The blocks are of two distinct types:


Type (i).

$\begin{pmatrix} Q_i & 0 \\ 0 & Q_j \end{pmatrix}, \quad \begin{pmatrix} M_i & 0 \\ 0 & M_j \end{pmatrix}, \quad \begin{pmatrix} 0 & S_{ij} \\ S_{ji} & 0 \end{pmatrix},$

where $Q_i S_{ij} = S_{ij} M_j$ and $Q_j S_{ji} = S_{ji} M_i$.


Type (ii). $Q_i$, $M_i$, $S_{ii}$, where $Q_i S_{ii} = S_{ii} M_i$.


Since any matrix commutative with $Q$ is also a diagonal block matrix
partitioned similarly to $Q$,⁴ we may treat each block separately. We first consider
those of type (i).


4. Type (i). Let $E$ be the identity matrix of the same order as $Q_i$ and
let $R = \begin{pmatrix} E & 0 \\ 0 & S_{ji}^{-1} \end{pmatrix}$. Then


(10)    $R[Q_i, Q_j]R^{-1} = [Q_i, M_i] = [Q_i, f(Q_i^*)]$


and

(11)    $R \begin{pmatrix} 0 & S_{ij} \\ S_{ji} & 0 \end{pmatrix} R^* = \begin{pmatrix} 0 & \epsilon E \\ E & 0 \end{pmatrix}.$


Since, by (10), $[Q_i, Q_j]$ is similar to $[Q_i, f(Q_i^*)]$, we may replace the block
$[Q_i, Q_j]$ in $Q$ by $[Q_i, f(Q_i^*)]$. If this is done, the corresponding block in $S$
is, as a consequence of (11), $\begin{pmatrix} 0 & \epsilon E \\ E & 0 \end{pmatrix}$. We have therefore


RESULT (a). Let $p_i(x)$ and $p_j(x)$ be two distinct characteristic factors
of the matrix $A$, which is normal with respect to $H$. If $p_i$ and $p_j$ are the
companion matrices of $p_i(x)$ and $p_j(x)$ respectively and if $f(p_j^*)$ satisfies
$p_i(x) = 0$, then $f(p_i^*)$ satisfies $p_j(x) = 0$. Corresponding to these two
characteristic factors there is in the canonical form of $Q$ a block $[Q_i, f(Q_i^*)]$
and in $S$ the block $\begin{pmatrix} 0 & \epsilon E \\ E & 0 \end{pmatrix}$.

4 John Williamson, “The idempotent and nilpotent elements of a matrix,” American
Journal of Mathematics, vol. 58 (October, 1936), no. 4, p. 757.




The matrix $Q_i$ in the above is not unique but may be replaced by any
matrix similar to it. If, however, $Q_i$ is taken in the Wedderburn canonical
form, for instance, the canonical form above is uniquely determined. If $p_i(x)$
is a characteristic factor of $A$ and if $f(p_i^*)$ does not satisfy $p_i(x) = 0$, then
for some value of $j$, distinct from $i$, $f(p_j^*)$ satisfies $p_i(x) = 0$ and $f(p_i^*)$
satisfies $p_j(x) = 0$. Therefore we have


THEOREM 2. Let $A$ be normal with respect to $H$, so that $AH = Hf(A^*)$.
Let the characteristic factors of $A$ be $p_i(x)$, $i = 1, 2, \dots, k$ and let $p_i$ be the
companion matrix of $p_i(x)$. If for no value of $i$, $i = 1, 2, \dots, k$, $f(p_i^*)$
satisfies the equation $p_i(x) = 0$, there exists a non-singular matrix $R$ such that

(12)    $RAR^{-1} = \begin{pmatrix} F & 0 \\ 0 & f(F^*) \end{pmatrix}$ and $RHR^* = \begin{pmatrix} 0 & \epsilon E \\ E & 0 \end{pmatrix}.$


The matrix $F$ in (12) is of order one-half that of $A$ while $E$ is the unit matrix
of the same order as $F$. The matrix $F$ may be replaced by any matrix similar
to it and, if $F$ is chosen in the Wedderburn canonical form, then $F$ is unique.
Accordingly (12) gives a canonical form for the matrices $A$ and $H$ when all
characteristic factors of $A$ are of the type (i). The canonical form (12) is
determined completely by the invariant factors of $A$ and the polynomial $f(x)$.
We have as an immediate corollary,


COROLLARY 1. If two similar matrices, which are both normal with
respect to the same hermitian or anti-hermitian matrix $H$, have all their
characteristic factors of type (i), they are also H-equivalent.
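A pair of matrices in the block shape described by Theorem 2 can be built and tested directly. The sketch below makes several assumptions: the automorphism is of period one, so that $F^*$ is the transpose of $F$; the block $H$ has $\epsilon E$ in its upper right corner and $E$ in its lower left corner; and the polynomial is $f(x) = 3 - x$, chosen because $f(f(x)) = x$. The block `F` in the usage note is a hypothetical example and the helper names are illustrative.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose; with the identity automorphism this is A*."""
    return [list(row) for row in zip(*a)]

def poly_at(coeffs, m):
    """Evaluate c0*I + c1*m + c2*m^2 + ... by Horner's rule."""
    n = len(m)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, m)
        for i in range(n):
            result[i][i] += c
    return result

def canonical_pair(f_block, coeffs, eps):
    """Return A = [F, f(F')] and H with blocks (0, eps*E; E, 0)."""
    n = len(f_block)
    g_block = poly_at(coeffs, transpose(f_block))
    zero = [[Fraction(0)] * n for _ in range(n)]
    unit = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    eps_unit = [[eps * v for v in row] for row in unit]
    a = ([list(r1) + r2 for r1, r2 in zip(f_block, zero)]
         + [r1 + r2 for r1, r2 in zip(zero, g_block)])
    h = ([r1 + r2 for r1, r2 in zip(zero, eps_unit)]
         + [r1 + r2 for r1, r2 in zip(unit, zero)])
    return a, h

def is_normal_wrt(a, h, coeffs):
    """Test the defining relation A H = H f(A*)."""
    return matmul(a, h) == matmul(h, poly_at(coeffs, transpose(a)))
```

With `F = [[1, 1], [0, 1]]` the characteristic factors of the resulting `A` are $x - 1$ and $x - 2$, and $f$ carries each root to the other, so neither factor is fixed, which is the type (i) situation. The check succeeds for both `eps = 1` (symmetric `H`) and `eps = -1` (skew-symmetric `H`).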


5. Proof of lemmas. Before proceeding to the consideration of the
case, in which the characteristic factors of $A$ are of type (ii), we prove some
lemmas.

Lemma II. Let $Q = [Q_1, Q_2]$, $M = [M_1, M_2]$ and $S = (S_{ij})$, $i, j = 1, 2$,
be similar partitions of the matrices $Q$, $M$ and $S$ and let $QS = SM$. If
$S^* = \epsilon S$ and $S_{11}$ is non-singular, there exists a non-singular matrix $R$, such
that $RQ = QR$ and $RSR^* = [S_{11}, T_{22}]$.

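Since $QS = SM$,

$Q_i S_{ij} = S_{ij} M_j$,    $(i, j = 1, 2).$
Therefore, if $E_i$ is the unit matrix of the same order as $Q_i$ and

$R = \begin{pmatrix} E_1 & 0 \\ -S_{21}S_{11}^{-1} & E_2 \end{pmatrix},$

$RQ = QR$. Further

$R^* = \begin{pmatrix} E_1 & -(S_{11}^*)^{-1}S_{21}^* \\ 0 & E_2 \end{pmatrix} = \begin{pmatrix} E_1 & -S_{11}^{-1}S_{12} \\ 0 & E_2 \end{pmatrix}.$

Consequently

$RSR^* = [S_{11}, T_{22}]$, where $T_{22} = S_{22} - S_{21}S_{11}^{-1}S_{12}$,

and the lemma is proved.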

Let $U$ be the auxiliary unit matrix of order $n$ and $V$ the auxiliary unit
matrix of order $m$. Let $p(x)$ be an irreducible polynomial of the ring $K[x]$
and $p$ be the companion matrix of $p(x)$. If $\theta$ is a zero of $p(x)$, then the
totality of matrix polynomials in $p$ with coefficients in $K$ is a field simply
isomorphic with the simple algebraic adjunction field $K(\theta)$. If $a_i = a_i(p)$
is a polynomial in $p$, by $a_iU$ we mean the matrix obtained from $a_i(\theta)U$ by
replacing $\theta$ by the matrix $p$. In other words, $a_iU$ is the direct product of $a_i$
by $U$ written in a definite way.

We now consider the matrix equation


(13)    $\phi(U)D = D\psi(V),$
where
$D = (d_{ij})$,    $(i = 1, 2, \dots, n;\ j = 1, 2, \dots, m)$,
and

$\phi(U) = pE_n + \sum_{i=1}^{n-1} b_iU^i, \qquad \psi(V) = pE_m + \sum_{j=1}^{m-1} c_jV^j,$

$b_i = b_i(p)$, $c_j = c_j(p)$, $b_1c_1 \ne 0$ and $E_k$ is the unit matrix of order $k$. Then
each element $d_{ij}$ is a polynomial in $p$ and we have


Lemma III.⁵ If $D$ satisfies (13), the first column of $D$ is zero, when
$m > n$, and the last row is zero, when $m < n$. If $m = n$, $D$ is non-singular
if, and only if, $d_{11} \ne 0$.


Further, if $m = n$, so that $D$ is a square matrix,
$D = D_0 + D_1 + \cdots + D_{n-1},$


where the only non-zero elements of $D_j$ are those $d_{rs}$ for which $s - r = j$. It
now follows from (13) that


$\sum_{i=1}^{n-1} b_iU^i \sum_{j=0}^{n-1} D_j = \sum_{j=0}^{n-1} D_j \sum_{i=1}^{n-1} c_iU^i$


and, from the nature of the matrices $D_j$, that


5 For proof see (1) page 363. The proof is the same as that of lemma 4 except
that $b_i$ and $c_j$ are now polynomials in $p$.




(14)    $\sum_{a=1}^{s} b_aU^aD_{s-a} = \sum_{a=1}^{s} c_aD_{s-a}U^a$,    $(s = 1, 2, \dots, n-1).$
Let D, = Dz =-+--=D,.=—0. Then from the first r of equations 
(14) we have 
(15) Do = cx DU", 
If now 


the (r-+ 1)-th equation in (14) becomes 
(16) b,UD, + Dy = ¢,D,U, 


while the remaining equations take the form 


r+1 


a=1 a=1 


or 
r r+1 


If D; is known for j <s—1, (17) may be solved for Ds. In fact if dj 
denotes the element, which is different from zero, in the i-th row of Ds_,, the 
non-zero element in the i-th row of the matrix on the left of (17) is didin 
—c,d;. Since 6,0, de, dz ete. can be determined successively in terms of 
d, and the elements of Dj, 7 <s—1. We have therefore proved 


Lemma IV. If (15) and (16) are satisfied, and Dy is non-singular, there 


exists a matrix 
D=D,+D,+: Drs 


such that 
r+1 
D = = 
4=1 j=1 


6. Case (ii). We now consider the case in which $Q_i$ is of type (ii)
so that $Q_iS_{ii} = S_{ii}M_i$. For convenience we temporarily drop all suffixes $i$ so
that QS — SM. Let the elementary factors of Q be [p(x)], i= 1, 
where =e, Let be the unit matrix of order e; and U; the 
auxiliary unit matrix of the same order. Then, if p is the companion matrix 
of p(x) and Qi = pH; + Ui, i —1,2,---,k, ++, Qx]. Moreover, 


M f(Q*) = [M,, -, Mx), 
where 


Mi = f(Q*:) =f (vt) Ee + f (pt) 1) (pt) 
(18) = gk, + 

1 


The matrices $a_j$ in (18) are completely determined by the matrix $p^*$ and the
polynomial $f(x)$ and are therefore independent of $i$. Since $Q$ is similar to $M$
and $M$ has the single characteristic factor $p(x)$, the minimal equation of $q$ is
$p(x) = 0$. Hence $p$ is similar to $q$ and there exists a non-singular matrix $y$
such that


(19)    $py = yq.$


As a consequence of (3), $p^* = f(q)$, so that the $a_j$ in (18) are also
polynomials in $q$ and may be written


(20)    $a_j = a_j(q).$


Therefore by (19) and (20)


(21)    $y\,a_j(q) = a_j(p)\,y.$
Let 7; be the counter unit matrix of order e; so that 


(22) Tar 


T = and D=—Sy"T, 
QDyT = DyTM 
= D[yTiMi] -,k), 
= DNyT, 
where, as a consequence of (18), (21) and (22), 
e471 
Ni = + 
j=1 
Therefore 


QD = DN. 
If 


D = (Dij), (4, 4, 
is a partition of D similar to that of Q, we have 
(23)    $Q_iD_{ij} = D_{ij}N_j$,    $(i, j = 1, 2, \dots, k).$


Each of the equations (23) is of the same type as (13) and therefore the 
form of Di; is known. Let e; = = > and let di; denote the 




element in the first row and the first column of Dij. If d4, 0, by Lemma 
III, D,, is non-singular. If d,,;—0 and dj; ~0, jc, by an interchange 
of rows and the same interchange of columns we may replace D,, by Dj; 
without disturbing Q or N, so that again we may suppose D,, to be non- 
singular. Finally let di; = - -—dee = 0. Since by Lemma III the 
first column of Dj; is zero, when j >, and D is non-singular, for at least 
one value of j, 1<j Sc, dj; 0. We may suppose without any loss of 
generality that dz; 40. If is any polynomial in p, the matrix 


where J is the unit matrix of order e; + e, +---+ ex, is commutative with Q. 
If RDyTR* = WyT, 


Wir 0 )- yy 0 )( -, 4 
Wa Wo. 0 yT; 0) Dao Doo 0 yT; 7(p*) 


so that 
Wir = Dir + 9(p) Dor + (p*)y* + Doon (p) 


But p*y = f(q)y = 9f(p) = yq* by (3) and (19), so that (24) becomes 


= Dir + 9(p) Dor + Din (q*) + 


Finally, if w,, is the element in the first row and column of W,, we have, 
since = do. = 0, 
(25) W11 = dor + 


Since R is commutative with Q we may replace D by W. Accordingly by 
Lemma III, if w,; 0, W,, is non-singular and we may therefore suppose 
that D,, is non-singular. There remains the possibility that w,, is zero for 
every choice of »(p). This cannot be the case, if the automorphism over the 
field K is of period two. For otherwise we would have in particular 


(26) Ode, + 0 


for every scalar matrix 6, which, when 6=1, yields d.,;—=—dy2. Since 
do, ~0, (26) reduces to 06=6 for every @ in K and this is a contradiction 
of the fact that the automorphism over K is actually of period two. If the 
automorphism over K is of period one and (25) is zero for every 7(p), then 
n(p) =7(q’) for every polynomial and therefore and 
Consequently we have proved 


Lemma V. The matrix $D_{11}$ may be taken as non-singular unless the
automorphism over the field $K$ is of period one, $d_{21} = -d_{12}$, and $q = p'$.


Since D,» satisfies (23), it follows from (14) that the principal diagonal 
of Dre is [ dio, a" and that of Da is [ dei, |, 
where for simplicity of notation we have written 


@=,(p) and n= 
Since S* = «8, (DyT)* =e(DyT) and 


= eD,.yT 2, 
so that finally 


(27) (p*) das (p*) edray. 
As a consequence of (19), 

(28) 

so that (27) becomes 

(29) a" (q*) des (q*)y* = 


In the case under consideration, when $D_{11}$ cannot be taken as non-singular,
by Lemma V, (29) simplifies to


a"* (p) doi (p)y’ = edyo(p)y 
or 
(30) a" dey’ €d 


Since $d_{21} = -d_{12}$, this last equation becomes $a^{n-1}y' = -\epsilon y$. When $q = p'$
it is possible to determine y so that $y = y'$ ⁶ and therefore $a^{n-1} = -\epsilon$.

If the automorphism over K is of period one, $q = p'$ and $a^{n-1} = -\epsilon$, the
original matrix $D_{11}$ must also be singular. For, corresponding to (30) we have
$a^{n-1}d_{11} = \epsilon d_{11}$ or $-\epsilon d_{11} = \epsilon d_{11}$, so that $d_{11} = 0$. Hence we have in place of
Lemma V


Lemma VI. The matrix $D_{11}$ may be taken as non-singular unless the
automorphism over K is of period one, $q = p'$ and $a_1^{n-1}(p) = -\epsilon$. Conversely,
if the above conditions are satisfied, the matrix $D_{11}$ is necessarily
singular.


By Lemma VI, if D,, may not be taken as non-singular, we may suppose 


⁶ John Williamson, "The equivalence of non-singular pencils of hermitian matrices
in an arbitrary field," American Journal of Mathematics, vol. 57 (July, 1935),
page 490. Cf. Section 8 of the present paper, formula (44).


346 JOHN WILLIAMSON. 


that = —d,, ~0 and then it is easily shown that D.. ~0. There. 
12 22 
fore either S,, 0 or 5 4 ~ 0, so that by Lemma III we may reduce 
M21 22 


Sir Sie 
Soy Soo 
the above argument where § is replaced by Ts, and finally reduce S to a 


S to one of the forms [S,;, 722] or [( :. Ts: |. We may now repeat 


diagonal block matrix where each block is either of type (i) i - where 
21 O22 
y, g ~ 0 or of type (ii) Si:, where | S;, | 40. 
N22 


Hence we need only consider blocks of type (i) or type (ii) separately. 


Sis | = | Soe | =0, | 


Sis 

So So» 

Y 

S 1 0 S 2 . 

and oo. Since dy, = dos = 0, is nilpotent; 
0 Beso So, 0 

since — dz, ~0, is non-singular; since 


[Qs, Qe] + 02) =(01 + 02)[Mi, Me], Me], (i=1,2). 


Further 


7. Type (i), D,, singular. The matrix ( )=« + o2, where 


[Q1, [M,, M2 |o27 = 0,027 
and 
(0,021) * = = 


Therefore, if 


k= — 0,02"), = V2] 2, 


and 
R O71 R* — 40,02) O71 C2 K— 
2 2 

= + fi; 
where 

= 02 — 30,0210, and = 
4 

Since | ¢2| =| o2 | 0, is non-singular and ¢, is nilpotent of index less 


than that of o;. In fact ¢;==0 moda,*%. If the above process be repeated 
with o; replaced by ¢i, in a finite number of steps we determine a matrix HW 


such that 
2| = and W w* — 


Finally, if X = 




X[Q:, = [Qs f(Q*s)] and 


Since $[Q_1, f(Q_1^*)]$ is similar to $[Q_1, Q_2]$, by Theorem 1 we may replace the
block $[Q_1, Q_2]$ in Q by $[Q_1, f(Q_1^*)]$; the corresponding block in the canonical
form of S is $\begin{pmatrix} 0 & E \\ \epsilon E & 0 \end{pmatrix}$. We have therefore

RESULT (b). Let A be normal with respect to H so that $AH = Hf(A^*)$.
If the automorphism over K is of period one, if $f(x) \equiv x \bmod p(x)$ and if
$a_1^{n-1}(x) \equiv -\epsilon \bmod p(x)$, where $a_1(x) = f'(x)$, then an elementary factor
$[p(x)]^n$ of A must occur an even number of times. Corresponding to each
pair of elementary factors $[p(x)]^n$ of A in the canonical form for A and H
occur the blocks $[Q, f(Q^*)]$ and $\begin{pmatrix} 0 & E \\ \epsilon E & 0 \end{pmatrix}$, where Q is the Wedderburn canonical
form for a matrix with the single elementary factor $[p(x)]^n$.

The canonical form above is determined uniquely by f(x) and the elementary
factor $[p(x)]^n$. Consequently we have


THEOREM 3. If A and $A_1$ are two similar matrices, both normal with
respect to H, and, if the conditions of result (b) are satisfied for every elementary
factor of A, then A and $A_1$ are H-equivalent.


8. Type (ii), $D_{11}$ non-singular. In type (ii), $S_{11}$ is non-singular and
therefore $D_{11}$ is non-singular. Since $Q_1$ has the single elementary factor
$[p(x)]^{e_1}$ the reduction of $Q_1$ and $S_{11}$ will be the same in principle as that of
a matrix Q which has a single elementary factor. To simplify our notation
we therefore consider the case in which A and, therefore, Q, has the single
elementary factor $[p(x)]^n$. If p is the companion matrix of p(x), Q has
the form

$pE + U$.


With our previous notation 


where $D_0$ is non-singular. Further, since $(pE + U)D = D\left(pE + \sum_{i=1}^{n-1} a_i(p)U^i\right)$,

$UD = D\sum_{i=1}^{n-1} a_i(p)U^i$,

and in particular
(31) $UD_0 = aD_0U$.
Accordingly,
$D_0 = [d, da, da^2, \dots, da^{n-1}]$, where $a = a_1(p)$


and, since $DyT = \epsilon(DyT)^*$,

$dy = \epsilon(da^{n-1}y)^* = \epsilon y^*d^*(a^{n-1})^*$,

and
$ady = \epsilon y^*d^*(a^{n-2})^*$.
Therefore,
$ay^*d^*a^* = y^*d^*$, or $ay^*a^* = y^*$
and by (28)
(32) $a\,a(q^*) = e$,

where e is the identity matrix of the same order as p.
We now wish to prove 


Lemma VII. There exists a non-singular matrix R such that 
n-1 
RSR* = and R(pE+ 


We shall prove this lemma by induction and accordingly assume that there 
exists a non-singular matrix W, such that 


WSW* —FyT and W(pE+U)W2=pE+U4 


1=2 


where 
Fy + Pri +° 
F, = Do, F; ~0 and F; is of the same form as Dj. 


Since QS = SM, 


r n-1 
(33) (pE+U+ =F (pE + 
where the c; are determined uniquely by the 6b; and the polynomial f(z). 
Consequently 
(34) b,U'*F, = (im 1,2,---,1); 
and 
(35) b,UF, = F,Uc, + 


From (34) and (35) we have 


b,UF — = 6, FU 
and therefore 
(36) b,0G, + = ¢,G,U, 
where 




r r+1 
G=G,+G-,-+ .. + Gri, such that => 
i=1 
Consequently 


r+1 r 


j=1 4=1 


— 
i=1 


Since FyT = «(FyT)*, FiyT = ey*TF*; and therefore 
yT Fy 1*F*, FoF ,yT. 
But 


and consequently 


where L; is of the same form as D;. Hence 
(38) GGo* FyT (GGo*)* = (Po+ 


and (38) that 
JSI* = (Do + Xe 


and 
J(pE + U)J*=pH+U+4+ >» b,U*. 


). form for A and H we may take 


n-1 n-1 
(39) Q—pF+U+D 001, S=DyT, 4+ 0", 
j=1 


determine. 
In place of (34) we have 


from which we get when i= 1, 


UD, — aD U, 


Equation (36) is the same as (16) with D replaced by G and (384) is the 
same as (15). Therefore by Lemma IV there exists a non-singular matrix 


where X; is of the same form as Dj. If J = GG,"1W, it follows from (36) 


Hence, by induction on r the lemma is proved. Accordingly in the canonical 


where Dy = [d, ad, a?d,- - -,a"d] and a=f’(p). The matrices bj and d 
are polynomials in p and the c; are determined completely by the b; and f(z). 
The matrices b; and c; satisfy certain simple equations, which we proceed to 




and therefore 


(41) = b,a', 


so that in particular c; =a. 
From equation (36) we obtain by multiplying throughout by y7’ 
(42) = (aG,U — UG,)yT. 
Now 
{.(aG,U — UG,)yT}* = y*T (a(p*) U’G*, — G*,U’), 
= «{a(q*)UG,—G,U }yT, 
= {UG,—aG,U}yT by (32). 
Therefore, by (42), 
— = GoyT )* 
= (q*) GU yT, 
Consequently 
bra (q*) (p) bras (p) (p) eras 
( (p*) = —a"(q) = (9) Cras (p)- 


Since DoyT is hermitian or anti-hermitian the matrix d satisfies 


(43) 


$dy = \epsilon y^*d(p^*)a(p^*)^{n-1}$.
Further, since $py = yq$,
$y^*p^* = q^*y^*$ and $y^*f(p^*) = f(q^*)y^*$ or by (3) $y^*q = py^*$.
Accordingly
(44) $y^* = h(p)y$,
and
$y = y^*h(p^*) = h(p)yh(p^*) = h(p)h(q^*)y$, so that

(45) $h(p)h(q^*) = e$
and finally
(46) $d = \epsilon h(p)d(q^*)a(q^*)^{n-1}$.


We have therefore proved 


Result (c). Let A be normal with respect to H, so that $AH = Hf(A^*)$,
and let A have the single elementary factor $[p(x)]^n$. Then, if p is the
companion matrix of p(x), a canonical form for A and H is given by (39).
The matrices a, $b_i$, $c_j$, d and y satisfy equations (32), (43), (44), (45) and
(46).


For the first time the canonical form is not completely determined by
f(x) and the elementary factor $[p(x)]^n$, since the matrix d(p) is independent
both of f(x) and p(x). If the elementary factors of A are $[p(x)]^n$ repeated
exactly s times, a canonical form for H would involve exactly s matrices $d_1(p)$,
$d_2(p), \dots, d_s(p)$, one for each factor $[p(x)]^n$. We have therefore proved,


THEOREM 4. If A is normal with respect to H, so that $AH = Hf(A^*)$,
a canonical form for A, H is Q, S, where Q and S are direct sums of matrices
given by results (a), (b) and (c). In general the canonical form is not
uniquely determined by f(x) and the invariant factors of A.


COROLLARY. Two similar matrices both normal with respect to the same
hermitian or anti-hermitian matrix are not necessarily H-equivalent.

9. Necessary and sufficient conditions for H-equivalence. If A is
normal with respect to H, there may exist two distinct canonical forms Q, $S_1$
and Q, $S_2$ for A, H. If this is so, by theorem 1 there must exist a non-singular
matrix R such that

$RQ = QR$ and $RS_1R^* = S_2$.


If Q is a diagonal block matrix, where each block has a single distinct charac- 
teristic factor, since R is commutative with Q, R is a diagonal block matrix 
partitioned similarly to Q. Accordingly it is sufficient to suppose that Q has 
a single characteristic factor and has the form used in §6. Further, as a 
consequence of result (b), we need only consider the case in which 

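$S_1 = DyT = [D_1yT_1, \dots, D_kyT_k]$ and $S_2 = FyT = [F_1yT_1, \dots, F_kyT_k]$,

where
$D_i = [d_i, ad_i, \dots, a^{e_i-1}d_i]$ and $F_i = [f_i, af_i, \dots, a^{e_i-1}f_i]$.
Since
$RS_1R^* = S_2$,
(47) $RD\bar R = F$,
where

$\bar R = yTR^*(yT)^{-1}$.

Since $RQ = QR$, $R^*Q^* = Q^*R^*$ and $yTQ^*(yT)^{-1}\bar R = \bar R\,yTQ^*(yT)^{-1}$. But
$yTQ^*(yT)^{-1}$ is obtained from Q by replacing p by $q^*$, so that $\bar R$ is of the same
general form as R. If $R = (R_{ij})$ and $\bar R = (\bar R_{ij})$, $i, j = 1, 2, \dots, k$, (47)
becomes

(48) $\sum_{\alpha=1}^{k} R_{i\alpha}D_\alpha\bar R_{\alpha j} = \delta_{ij}F_i$, $(i, j = 1, 2, \dots, k)$,

where $\delta_{ij}$ is the Kronecker delta.

Let $r_{ij}$ and $\bar r_{ij}$ denote the elements in the first row and first column of
$R_{ij}$ and $\bar R_{ij}$ respectively. Then from the nature of the matrices $R_{ij}$, $\bar R_{ij}$ it
follows from (48) that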


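(49) $\sum_{\alpha=1}^{k} r_{i\alpha}d_\alpha\bar r_{\alpha j} = \delta_{ij}f_i$, $(i, j = 1, 2, \dots, k)$.

If $e_{c-1} > e_c = e_{c+1} = \dots = e_d > e_{d+1}$,
$r_{i\alpha} = 0$, if $\alpha < c$, $\bar r_{\alpha j} = 0$, if $\alpha > d$, for all $i, j = c, c+1, \dots, d$. Hence
(49) yields

$\sum_{\alpha=c}^{d} r_{i\alpha}d_\alpha\bar r_{\alpha j} = f_i\delta_{ij}$, $(i, j = c, c+1, \dots, d)$,
or
(50) $\sum_{\alpha=c}^{d} r_{i\alpha}(p)\,d_\alpha(p)\,r_{j\alpha}(q^*) = \delta_{ij}f_i(p)$, $(i, j = c, c+1, \dots, d)$.

By re-arrangement of the rows and columns of R, it may be shown that $|r_{ij}|$,
$i, j = c, c+1, \dots, d$, is a factor of $|R|$ and is therefore non-zero.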

Let $\theta$ be a zero of p(x) and $\Omega$ be the simple algebraic adjunction field
$K(\theta)$. Since $q^* = f(p)$ and satisfies $p(x) = 0$, if $\phi = f(\theta)$ and $g(\theta)$ is any
element of $\Omega$, the correspondence $g(\theta) \leftrightarrow g(\phi)$ induces an automorphism of $\Omega$.
This automorphism is of period one or two and we can therefore define the
conjugate transposed of a matrix with elements in $\Omega$. Equation (49) is
equivalent to

$\sum_{\alpha=c}^{d} r_{i\alpha}(\theta)\,d_\alpha(\theta)\,r_{j\alpha}(\phi) = \delta_{ij}f_i(\theta)$, $(i, j = c, c+1, \dots, d)$,

and this last may be written more compactly in the form

(51) $P[d_c(\theta), \dots, d_d(\theta)]P^* = [f_c(\theta), \dots, f_d(\theta)]$,

where $P = (r_{ij}(\theta))$, $i, j = c, c+1, \dots, d$, and $P^*$ is the conjugate transposed
of P. Conversely, if (51) is satisfied, so is (50) and the equations
obtained from (50) by multiplying by $a^s$, $s = 1, \dots, e_c - 1$. Consequently,

$\sum_{\alpha=c}^{d} r_{i\alpha}D_\alpha r_{j\alpha}(q^*) = \delta_{ij}F_i$, $(i, j = c, c+1, \dots, d)$,
and finally

$W[D_cyT_c, D_{c+1}yT_{c+1}, \dots, D_dyT_d]W^* = [F_cyT_c, F_{c+1}yT_{c+1}, \dots, F_dyT_d]$,
where $W = (r_{ij}E_c)$, $i, j = c, c+1, \dots, d$. Since $|P| \neq 0$, W is non-singular.

If there are t distinct powers of p(x) that occur among the elementary
factors of A, there will be exactly t equations (51). These t equations are
the necessary and sufficient conditions that the two canonical forms be equivalent.
Let us for convenience call the diagonal matrix $[d_c(\theta), \dots, d_d(\theta)]$
the diagonal matrix associated with the elementary factor $[p(x)]^{e_c}$. We
therefore have




THEOREM 5. When A is normal with respect to H, two canonical forms
for A and H are equivalent if, and only if, the diagonal matrices associated
with each elementary factor $[p_i(x)]^{e_{ij}}$ of A are conjunctively equivalent in
the field $K(\theta_i)$, where $\theta_i$ is a zero of $p_i(x)$.


THEOREM 6. When A and $A_1$ are similar matrices both normal with
respect to H, they are H-equivalent if, and only if, for all values of i and j,
the diagonal matrix associated with the elementary factor $[p_i(x)]^{e_{ij}}$ in a
canonical form for A and H is conjunctively equivalent in $K(\theta_i)$ to that of
the canonical form for $A_1$ and H.


THEOREM 7. Let $A_1$ be normal with respect to $H_1$ and $A_2$ normal with
respect to $H_2$. Necessary and sufficient conditions that there exist a non-singular
matrix P such that $PA_1P^{-1} = A_2$ and $PH_1P^* = H_2$, are (i) the
invariant factors of $A_1$ be the same as those of $A_2$ and (ii) for all values of
i and j the diagonal matrix associated with $[p_i(x)]^{e_{ij}}$ in a canonical form
for $A_1$ and $H_1$ be conjunctively equivalent in $K(\theta_i)$ to that of a canonical
form for $A_2$ and $H_2$.


10. Particular fields K. Certain simplifications arise when K is the
field of all complex numbers and A* is the conjugate transposed of A in the
classical sense. Results (a) and (b) are practically unaltered but result (c)
can be greatly simplified. The field K is algebraically closed, so that p is a
matrix of one row and column, $q^* = p$ and $\Omega = K(\theta) = K$. Equation (32)
reduces to $a\bar a = 1$ so that $a = e^{i\alpha}$ and (46) yields $d = \epsilon\bar d e^{-(n-1)i\alpha}$. If $d = re^{i\phi}$,
$e^{2i\phi} = \epsilon e^{-(n-1)i\alpha}$ and for a fixed value of $\alpha$, $e^{i\phi} = \pm\sigma$. Therefore
$d = \pm r\sigma$ where $\sigma$ is uniquely determined. The matrix $P^*$ in (51) is the
conjugate transposed of P and, if $d_j = \delta_j\sigma$ and $f_j = \gamma_j\sigma$, where $\delta_j$ and $\gamma_j$
are real, (51) states that $[\delta_c, \delta_{c+1}, \dots, \delta_d]$ have the same index as
$[\gamma_c, \gamma_{c+1}, \dots, \gamma_d]$. This final result gives Theorem 4 of the paper dealing
with this special case. The somewhat simpler canonical form found previously
for this particular case can now be obtained by a simple transformation.⁷
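
As an illustration that is not part of the original paper, the index condition for real diagonal matrices can be sketched as follows. The sketch assumes that "index" means the pair (number of positive entries, number of negative entries) and that all entries are non-zero; the function names are hypothetical.

```python
import math


def index(diagonal):
    """Return (number of positive entries, number of negative entries)."""
    positives = sum(1 for x in diagonal if x > 0)
    negatives = sum(1 for x in diagonal if x < 0)
    return positives, negatives


def conjunctively_equivalent(delta, gamma):
    """Sylvester's law of inertia for non-singular real diagonal matrices."""
    if any(x == 0 for x in list(delta) + list(gamma)):
        raise ValueError("all diagonal entries must be non-zero")
    return len(delta) == len(gamma) and index(delta) == index(gamma)


def scaling_transformation(delta, gamma):
    """Diagonal of a real diagonal P with P diag(delta) P* = diag(gamma).

    Only valid when the signs of delta and gamma agree position by position.
    """
    return [math.sqrt(g / d) for d, g in zip(delta, gamma)]


delta = [2.0, -3.0, 5.0]
gamma = [8.0, -12.0, 45.0]
p_diag = scaling_transformation(delta, gamma)
transformed = [x * d * x for x, d in zip(p_diag, delta)]
```

When the signs do not agree position by position, a permutation of rows and columns would be applied first; that step is omitted here.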

If K is the field of all real numbers there are two distinct types of irreducible
factors p(x), either linear or quadratic. If p(x) is linear, (51) shows
that $[d_c, \dots, d_d]$ must have the same index as $[f_c, \dots, f_d]$. If p(x) is
quadratic, $K(\theta)$ is the field of all complex numbers. If $\phi = \theta$, so that the
automorphism over $\Omega$ is of period one, the matrix $P^*$ is the transposed of P.
Since $\Omega = K(\theta)$ is the field of all complex numbers, there always exists a
non-singular matrix P satisfying (51) and the conditions imposed by (51)
may be omitted. In the remaining case when $\phi = \bar\theta$, the conjugate of $\theta$, the
automorphism over $\Omega$ is of period two and $P^*$ is the conjugate transposed of P.


⁷ See footnote (1).


The matrix a is an orthogonal matrix of order 2, y is the identity matrix and
$q^* = p'$. If instead of writing $p = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$ the notation $a + ib$ is adopted,
the arguments used above, for K the field of all complex numbers, may be
repeated word for word and again (51) is equivalent to the fact that the
indices of two real diagonal matrices be the same.

If K is the field of all complex numbers and the automorphism over K
is of period one, the fact that $K(\theta) = K$ and $P^* = P'$ shows that the conditions
imposed by (51) may be omitted.

The exact conditions imposed by (51) for a general field K are not known,
but they depend on the arithmetic properties of the field.⁸


11. Particular cases of f(x). Some particularizations of the polynomial
f(x) are of special interest. The first of these is when $f(x) = x$.
Then $AH = HA^*$, so that

$(AH)^* = H^*A^* = \epsilon HA^* = \epsilon AH$,

and the matrix AH is also hermitian or anti-hermitian. Further, if

$PAP^{-1} = A_1$ and $PHP^* = H_1$,
$PAHP^* = A_1H_1$.

Therefore by particularizing f(x) to be x, the results of this paper may be
applied to the conjunctive equivalence of two pencils of non-singular hermitian
or anti-hermitian matrices. If $\epsilon = -1$, and the automorphism over K is of
period one, both matrices AH and H are skew symmetric. Since $f'(x) = +1$,
$a = +1$ and for all values of n, $a^{n-1} = -\epsilon$. Therefore the conditions of
result (b) are always satisfied, and, as an immediate consequence of theorems
2 and 3, we have


THEOREM 8. Two non-singular pencils of skew symmetric matrices in
an arbitrary field are congruently equivalent, if, and only if, the two pencils
have the same invariant factors.⁹
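
The case f(x) = x can be checked on a small example. The following sketch is an illustration, not the paper's construction: it assumes the automorphism is the identity (so that A* is the ordinary transpose and hermitian means symmetric), takes ε = 1, and builds A as $BH^{-1}$ from two arbitrarily chosen symmetric rational matrices B and H.

```python
from fractions import Fraction as F


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def inverse_2x2(X):
    """Exact inverse of a non-singular 2 x 2 matrix."""
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


H = [[F(2), F(1)], [F(1), F(3)]]    # symmetric and non-singular (illustrative)
B = [[F(1), F(4)], [F(4), F(-2)]]   # symmetric (illustrative)
A = matmul(B, inverse_2x2(H))       # chosen so that AH = B

AH = matmul(A, H)
HA_star = matmul(H, transpose(A))   # H f(A*) with f(x) = x and A* = A'
```

Here A is normal with respect to H in the sense AH = HA*, and AH is again symmetric, in agreement with the relation (AH)* = εAH.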

Let $f(x) = x$ and $\epsilon = 1$. If $p_i(x)$ is a characteristic factor of A and
$\bar p_i(x) \neq p_i(x)$, the matrix $p_i^*$ satisfies $\bar p_i(x) = 0$ and $p_i(x)$ is a characteristic
factor of type (i). A characteristic factor of type (ii) is a polynomial p(x)
⁸ L. E. Dickson, "On quadratic, bilinear and hermitian forms," Transactions of the
American Mathematical Society, vol. 7 (1906), pp. 275-292; "On quadratic forms in a
general field," Bulletin of the American Mathematical Society, vol. 14 (1907-8), pp.
108-115. H. Hasse, "Symmetrische Matrizen im Körper der rationalen Zahlen," Crelle,
vol. 153, pp. 12-43.

⁹ Cf. John Williamson, "The conjunctive equivalence of pencils of hermitian and
anti-hermitian matrices," American Journal of Mathematics, vol. 59 (April, 1937),
No. 2, pp. 399-413, Theorem 7.




which is unaltered by the automorphism over K. Since f(z) =1, a=1, 
q* = p and, as stated earlier, y= 7 and (46) yields d= d. Equations (43) 
now become = =— Cry. Since Q* =f(Q*), and 
accordingly = 0, r=1,2,---,n—2. Equation (51) is not simplified 
beyond the fact that the elements $d_i(\theta)$, $f_i(\theta)$ are unaltered by the original
automorphism over K. From result (a) and the simplified form of (39) we
obtain a canonical form Q, S for the matrices A, H and therefore a canonical
form QS, S for the pair of hermitian matrices AH, H under non-singular
conjunctive transformations. The canonical forms thus obtained differ slightly
from those obtained in a previous paper,¹⁰ as previously the matrices were
considered as matrices not over K but over the field unaltered by the
automorphism over K.

In a similar manner, if $f(x) = -x$ and $\epsilon = -1$, the conditions for the
equivalence of two pairs of matrices, one symmetric and the other skew symmetric
and non-singular, under congruent transformations may be deduced.¹¹

If $f(A^*) = (A^*)^{-1}$, $AHA^* = H$, A is a conjunctive automorph of H and the
problem under consideration reduces to that of the equivalence of two conjunctive
automorphs of H under similarity transformations by matrices, which
are also conjunctive automorphs of H. This special problem is the simple
extension of that of the unitary equivalence of two unitary matrices. The
particular case of this problem when K is the complex field and $\epsilon = 1$ ¹² has
been dealt with, as well as that in which K is the real field and $\epsilon = -1$.¹³


12. H singular. It seems unlikely that the above results could be
extended to the case in which H is singular without some further hypotheses.
If, for example, H is the zero matrix, the fact that A is normal with respect
to H is vacuous, inasmuch as every matrix is now normal with respect to H.
If A is normal with respect to H and H is non-singular, the correspondence
between A and $f(A^*)$ is a particular case of an involution¹⁴ and by (3)
$f\{f(A)\} = A$. When H is singular a practical definition of normality would be:
A matrix A is normal with respect to H if $AH = Hf(A^*)$ and $f\{f(A)\} = A$.
With this definition of normality, it would be possible to extend the results of
this paper to the case in which H is singular. Certain complications arise,
¹⁰ See footnote 6, page 487.

¹¹ See John Williamson, "On the algebraic problem concerning the normal forms of
linear dynamical systems," American Journal of Mathematics, vol. 58 (January,
1936), pp. 141-163.

¹² John Williamson, "Quasi-unitary matrices," Duke Mathematical Journal, vol. 3
(December, 1937), No. 4, pp. 715-25.

¹³ John Williamson, "On the normal forms of linear canonical transformations,"
American Journal of Mathematics, vol. 59 (July, 1937), No. 3, pp. 599-617.

¹⁴ Cf. A. A. Albert, Modern Higher Algebra, University of Chicago Press, 1937,
Chapter V.


for example in the proof of result (a) ; S;; is non-singular and the form of §;; 
would have to be considered in more detail. If this is done, it can be shown 
that result (a) still holds, if # is replaced by U", where n —r is the rank of 
Si;. For the sake of brevity, the remaining results will not be given here. 


13. Characteristic of K different from zero. If the characteristic of
K is different from two, all results hold except when p(x) is inseparable.
Then the usual Wedderburn canonical form for a matrix with the elementary
factor $[p(x)]^n$ is no longer available. However, we may use instead the
matrix

(52) $Q_i = pE + mU$,

where every element of m is zero except the element in the last row and
column, which has the value unity. The coefficient of U in $Q_i^{t+1}$ is

$\sum_{j=0}^{t} p^jmp^{t-j}$ and $p\sum_{j=0}^{t} p^jmp^{t-j} - \sum_{j=0}^{t} p^jmp^{t-j}\,p = p^{t+1}m - mp^{t+1}$.
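
The telescoping identity $pS - Sp = p^{t+1}m - mp^{t+1}$, with $S = \sum_{j=0}^{t} p^jmp^{t-j}$, can be checked numerically. The sketch below is only an illustration: the 2 x 2 integer matrix p is an arbitrary choice and is not claimed to be the companion matrix of any polynomial used in the paper; m has a single unit element in the last row and column.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def matsub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def matpow(X, e):
    """e-th power of X by repeated multiplication (e >= 0)."""
    size = len(X)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(e):
        result = matmul(result, X)
    return result


def coefficient_of_U(p, m, t):
    """The sum of p^j m p^(t-j) for j = 0, ..., t."""
    size = len(p)
    total = [[0] * size for _ in range(size)]
    for j in range(t + 1):
        total = matadd(total, matmul(matmul(matpow(p, j), m), matpow(p, t - j)))
    return total


def telescoping_holds(p, m, t):
    s = coefficient_of_U(p, m, t)
    left = matsub(matmul(p, s), matmul(s, p))
    right = matsub(matmul(matpow(p, t + 1), m), matmul(m, matpow(p, t + 1)))
    return left == right


p = [[0, 1], [-2, 3]]   # illustrative integer matrix
m = [[0, 0], [0, 1]]    # unit element in the last row and column
```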
Therefore, if h is the coefficient of U in p(Qi), 

ph — hp = p(p)m — mp(p) = 0, 
and h is commutative with p. Hence h is a polynomial in p so that we may 
replace m in (52) by hp or, what is equivalent to this, assume that the 
coefficient of U in p(Qi) is unity. With the notation of § 6 
= gk + 

where the coefficient a of U’ in p(M;) is a polynomial in g. If S = DyT, 
each D,; satisfies the conditions of lemma III except that the elements of Di; 
are not all polynomials in p. Since p(Qi)S =Sp(M;), lemma VI is true 


where 
$D_0 = [d, ad, \dots, a^{n-1}d]$


and d is a polynomial in p. Unfortunately, at present we have not been able
to determine the form that Q assumes when D is reduced to diagonal form.
The use of the alternative canonical form given by Wedderburn¹⁵ did not lead
to any more satisfactory results. Accordingly the problem when A has a single
elementary factor $[p(x)]^n$ and p(x) is inseparable is as yet unsolved.

The case of characteristic two has not been considered. This is always
an exceptional case and certainly the methods of this paper, which use so often
division by two, would not apply. It is hoped that the methods used by
Albert¹⁶ may be applicable and lead to a satisfactory solution.


THE JOHNS HOPKINS UNIVERSITY. 


¹⁵ J. H. M. Wedderburn, "The canonical form of a matrix," Annals of Mathematics,
vol. 39 (January, 1938), No. 1, pp. 178-80.

¹⁶ A. A. Albert, "Symmetric and alternate matrices in an arbitrary field," Transactions
of the American Mathematical Society, vol. 43 (May, 1938), No. 3, pp. 386-436.


THE WARING PROBLEM WITH SUMMANDS $1 + bx^n$.*


By Mary HABERZETLE. 


A positive integer will be said to be represented by [u, w] if it is a sum
of u positive integers of the form $1 + bx^n$ or 0 and w positive integers of the
form $a(1 + bx^n)$ or 0, where a and b are fixed positive integers. In this
paper u and w will be determined such that for $19 \le n \le 400$, $1 \le a \le 4n$,
$1 \le b \le 2n + 1$ every positive integer is represented by [u, w]. The problem
is a modified Waring problem and was suggested by L. E. Dickson. The
results obtained are stated in Theorem 3.

For brevity define $p_0 = 0$, $p_1 = 1$, $p_2 = 1 + b$, $p_3 = 1 + b \cdot 2^n$, ..., $p_i = 1 + b(i-1)^n$.
For $v > 0$ denote by $v'$ the number of terms, 0, 1, 1 + b,
required to represent v, and by $v''$ the number of terms, 0, 1, 1 + b, required
to represent each of 1, ..., v. For $v \le 0$, define $v' = v'' = 0$. The notation
[z] is used to denote the integral part of z.
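
The notation can be made concrete with a short sketch, which is an illustration and not part of the paper. It assumes that $v'$ means the least number of non-zero terms 1 or 1 + b whose sum is v, that $v''$ is the largest such count over 1, ..., v, and that both are 0 for v ≤ 0; the function names are hypothetical.

```python
def p(i, b, n):
    """The summand p_i: p_0 = 0 and p_i = 1 + b (i - 1)^n for i >= 1."""
    if i == 0:
        return 0
    return 1 + b * (i - 1) ** n


def terms_needed(v, b):
    """v': least number of terms 1 or 1 + b with sum v (0 when v <= 0).

    Using as many terms 1 + b as possible is optimal, since each one
    replaces 1 + b terms equal to 1.
    """
    if v <= 0:
        return 0
    return v // (1 + b) + v % (1 + b)


def terms_needed_up_to(v, b):
    """v'': number of terms sufficient for each of 1, ..., v (0 when v <= 0)."""
    if v <= 0:
        return 0
    return max(terms_needed(u, b) for u in range(1, v + 1))
```

For example, with b = 3 the integer 10 = 4 + 4 + 1 + 1 needs four terms, while 8 = 4 + 4 needs two.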


1. Lemmas for ascent. Lemmas 1, ..., 4 have been proved by L. E.
Dickson.¹


Lemma 1. If $z \ge a$ and all integers in the interval $(E, E + zp_k)$ are
represented by [A, B], then all in $(E, E + (z + av)p_k)$ are represented by
[A, B + v].


Lemma 2. If $z \ge 1$ and all integers in $(E, E + zp_k)$ are represented by
[A, B], then all in $(E, E + (z + w)p_k)$ are represented by [A + w, B].


Lemma 3. Let R; denote the least positive or zero residue modulo 
a of 7>1. Let Qj = (G;—R;)/a. If all integers im 
(Z,E + apm+) are represented by [R, 8], then all in (LE, EH + pm) are repre- 
sented by [R, S + Qm]. 

Lemma 4. If $m \ge 4$ and if all integers in $(E, E + ap_3)$ are represented
by [A, B], then all in $(E, E + p_m)$ are represented by [C, D],
$C = A + (m - 4)(a - 1)$, $D = B + \sum_{i=4}^{m} Q_i$.

Lemma 5. Let C be a positive integer and define c to be the least positive
residue of C modulo a. Each of the integers 0, 1, ..., C - 1 is represented by

$[(c-1)'', k']$ or $[(a-1)'', (k-1)'']$,

where $k = (C - c)/a$. The second form is to be suppressed if k = 0.


* Received June 9, 1938.
¹ "Universal forms $\Sigma a_ix_i^n$ and Waring's problem," Acta Arithmetica, vol. 2 (1937),
pp. 178-179.


Those integers which are $\ge C - c = ka$ are sums of ka and 0, 1, ...,
c - 1. The latter are sums of $(c-1)''$ terms 0, 1, 1 + b, and ka is a sum
of $k'$ terms 0, a, a(1 + b).

Only when $k \ge 1$ are there further integers, $ka - ma + j$ ($j = 0, 1, \dots, a - 1$;
$k \ge m \ge 1$), and these are represented by $[(a-1)'', (k-1)'']$.
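
Lemma 5 lends itself to a brute-force check for small values. The sketch below is an illustration under stated assumptions: the two forms are read as $[(c-1)'', k']$ and $[(a-1)'', (k-1)'']$, only the summands 0, 1, 1 + b and 0, a, a(1 + b) are used (as in the proof), and the helper names are hypothetical.

```python
def prime(v, b):
    """v': least number of terms 1 or 1 + b with sum v (0 when v <= 0)."""
    if v <= 0:
        return 0
    return v // (1 + b) + v % (1 + b)


def double_prime(v, b):
    """v'': number of terms sufficient for each of 1, ..., v."""
    if v <= 0:
        return 0
    return max(prime(u, b) for u in range(1, v + 1))


def representable(N, u, w, a, b):
    """True if N = s + a*t with s a sum of at most u terms 1 or 1 + b
    and t a sum of at most w terms 1 or 1 + b."""
    for t in range(N // a + 1):
        if prime(N - a * t, b) <= u and prime(t, b) <= w:
            return True
    return False


def lemma5_forms(C, a, b):
    c = C % a or a                      # least positive residue of C modulo a
    k = (C - c) // a
    forms = [(double_prime(c - 1, b), prime(k, b))]
    if k >= 1:                          # the second form is suppressed when k = 0
        forms.append((double_prime(a - 1, b), double_prime(k - 1, b)))
    return forms


def lemma5_holds(C, a, b):
    forms = lemma5_forms(C, a, b)
    return all(any(representable(N, u, w, a, b) for u, w in forms)
               for N in range(C))
```

A call such as `lemma5_holds(23, 5, 2)` checks every integer from 0 to 22 against the two forms.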


Lemma 6. Let $p_4 = qp_3 + r$, $0 \le r < p_3$. Denote by f, g, and d the
least positive residues modulo a of $p_3 - r$, r, and q, respectively. Then every
integer in the interval $(qp_3, (q + 1)p_3)$ is represented by one of


(1) [a+1+(g—1)”, (=1)+(—)], 


(4) [ (@—1)" +1, 


The second is to be suppressed if $r \le a$ and the fourth if $p_3 - r \le a$.


(2) [a+ (a—1)", ("4 


By Lemma 5, 0, 1,---,7— are represented by [ @— (=) ] 
if ra, but by it or [@- ay”, (4-1) | if r>a. To these add 


[ a, 


is increased by one for reasons which will appear shortly. 
The remaining integers in the interval (except the final one) are sums 
of gps + 7 = p, and 0,1,- - p;—r—1. These are represented by 


Finally, (¢ + 1)p; is represented by [ a +1, i—*) and hence by (1): 


— which represents gp; = dps + ap;. The weight of (1) 


Lemma 7. Every integer in the interval $(qp_3, (q + a)p_3)$ is represented
by one of the forms obtained by adding $a - 1$ to the first entries of the forms
(1), ..., (4).




The lemma follows from Lemmas 2 and 6.

2. Representation of integers $\le qp_3 - 1$.


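Lemma 8. Let $a < p_3$. Let c and d be the least positive residues modulo
a of $p_3$ and q, respectively. Then every positive integer $\le qp_3 - 1$ is represented
by one of the two forms,

$[U_0, V_0] = \left[a - 1 + (c-1)'',\ \left(\frac{p_3 - c}{a}\right)' + \frac{q - d}{a}\right]$,

$[U_1, V_1] = \left[a - 1 + (a-1)'',\ \left(\frac{p_3 - c}{a} - 1\right)'' + \frac{q - d}{a}\right]$.

The integers in question are $xp_3 + y$, $0 \le x \le q - 1$, $0 \le y \le p_3 - 1$.
By Lemma 5 the y's are represented by

$\left[(c-1)'', \left(\frac{p_3 - c}{a}\right)'\right]$ or $\left[(a-1)'', \left(\frac{p_3 - c}{a} - 1\right)''\right]$.

Since $q - 1 \equiv d - 1 \pmod{a}$ and $d \le a$, $xp_3$ is represented by $[a - 1,
(q - d)/a]$. Hence all are represented by $[U_0, V_0]$, $[U_1, V_1]$.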


THEOREM 1. Let $1 \le a \le 4n$, $1 \le b \le 2n + 1$. Let c and d be the
least positive residues modulo a of $p_3$ and q, respectively. Then every positive
integer $\le qp_3 - 1$ is represented by


[ an + a(2n +1) + (a—1)”—1, 
[4n/a] — 2n | 


a a 


In Lemma 8 increase U, to U,. Both (A=) and (eo 1) are 


less than or equal to (2 5. ) . It is a consequence then of Lemma 8 that 


every positive integer = gp; —1 is represented by 


r—[a—1+ (a—1) ) 

In F replace each of [4n/a] + 2n terms aps by pe -+* * *-+ pa (with a sum- 
mands) to obtain H. It will be shown later that 


(6) 


a a 


3. Formulas for t ascents at one step. No proofs will be given for the
lemmas in this section. References are made to similar lemmas and their
proofs.


Lemma’ 9. If g=0 and s=g +a, there exists a positive integer i 
such that 
g9Ss—a(1+ bi") <g4+ 
Lemma³ 10. Let g be an integer $\ge 0$, L an integer $\ge a$,

$G = \{(ab)^{-1}(L/n)^n\}^{1/(n-1)}$.

If $G \ge g + L$ and if all integers between g and g + L inclusive are represented
by F = [R, S], then every integer s between g and G inclusive is
represented by $F_1 = [R, S + 1]$.


Write $L_0 = g + L$, $v = (1 - g/L_0)/n$, $L_1 = G$. Lemma 10 becomes

Lemma 11. Let g be an integer $\ge 0$, $L_0 \ge g + a$,

$L_1 = \{(ab)^{-1}(L_0v)^n\}^{1/(n-1)}$.

If $L_1 \ge L_0$ and if all integers between g and $L_0$ inclusive are represented
by F = [R, S], then all between g and $L_1$ inclusive are represented by
$F_1 = [R, S + 1]$.


Lemma⁴ 12. Let g be an integer $\ge 0$, $v = (1 - g/L_0)/n$, $L_0 \ge g + a$,
$L_0v^n \ge ab$. Compute $L_t$ from

$\log_{10} L_t = \left(\frac{n}{n-1}\right)^t(\log_{10} L_0 + w) - w$,

$w = n\log_{10} v - \log_{10} ab$. If all integers between g and $L_0$ inclusive are represented
by F = [R, S] then all between g and $L_t$ inclusive are represented by
$F_t = [R, S + t]$.
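
The closed form for $\log_{10} L_t$ can be compared with t repetitions of a one-step ascent. The sketch below is an illustration resting on assumptions: the one-step relation is taken to be $L_{k+1} = \{(ab)^{-1}(L_kv)^n\}^{1/(n-1)}$ with v held fixed at its initial value, and the numerical values of n, a, b, g and $L_0$ are arbitrary choices rather than values from the paper.

```python
import math


def log10_L_closed(log10_L0, w, n, t):
    """Closed form: (n / (n - 1))^t (log10 L_0 + w) - w."""
    ratio = n / (n - 1)
    return ratio ** t * (log10_L0 + w) - w


def log10_L_iterated(log10_L0, w, n, t):
    """Apply the assumed one-step relation t times, working with logarithms.

    log10 L_{k+1} = (n log10 L_k + w) / (n - 1), where w = n log10 v - log10(ab).
    """
    value = log10_L0
    for _ in range(t):
        value = (n * value + w) / (n - 1)
    return value


n, a, b = 19, 4, 3                 # illustrative parameters
g, L0 = 10.0 ** 20, 10.0 ** 30     # illustrative endpoints with L0 >= g + a
v = (1 - g / L0) / n
w = n * math.log10(v) - math.log10(a * b)
```

Working with logarithms avoids overflow, since $L_t$ itself grows very quickly with t.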


Apply the above lemma with a = 1 and $L_t$ in place of $L_0$ to obtain


Lemma 13. Let g be an integer $\ge 0$, $V = (1 - g/L_t)/n$, $L_t \ge g + 1$,
$L_tV^n \ge b$. If all integers between g and $L_t$ inclusive are represented by
$F_t = [R, S + t]$ then all between g and $L_{t+T}$ are represented by $F_{t+T}
= [R + T, S + t]$ and

(7) $\log_{10} L_{t+T} = \left(\frac{n}{n-1}\right)^{t+T}(\log_{10} L_0 + w) + \left(\frac{n}{n-1}\right)^{T}(W - w) - W$,

where $W = n\log_{10} V - \log_{10} b$.


² L. E. Dickson, "Generalizations of Waring's theorem on fourth, sixth, and eighth
powers," American Journal of Mathematics, vol. 49 (1927), p. 242.

³ Dickson, "Universal forms $\Sigma a_ix_i^n$ and Waring's problem," op. cit., pp. 184-185.

⁴ Ibid., p. 185.


Lemma 13 will be used for $L_t$ so large that 1/n is a sufficiently close
approximation to V, whence $W = -n\log_{10} n - \log_{10} b$. But $v < 1/n$,
whence $W - w > \log_{10} a$. Since the last two terms of (7) are positive

(8) $\log_{10} L_{t+T} > \left(\frac{n}{n-1}\right)^{t+T}(\log_{10} L_0 + n\log_{10} v - \log_{10} ab)$.

4. Results from the asymptotic theory and prime number theory. The 
following theorem is a consequence of a lemma ® proved by L. E. Dickson. 


THEOREM 2. Let ky be the least integer exceeding (log r,)/(—log(1—v)), 
where 1, = n° (6n—1)/(n—d), d=1- 2n?z, and v=1/n. Take 
2k. Then if 1SaS4n, every integer = N, is represented by 
[4n + b —1, 3k, —2] where = bN + 4n+ a(3k, — 2) and logy) N = .8n°. 

Lemma 14. If 1s a positive integer and K;, = [1"/(t—1)"], m2 6, 


n=9, then Ki < X, X = K,+ 5K; + m—n—9-+ nlog(m—1)/4 


5. Solution for $19 \le n \le 400$.


THEOREM 3. Let $1 \le a \le 4n$ and $1 \le b \le 2n + 1$. Let $19 \le n \le 400$
and denote by c and d the least positive residues modulo a of $p_3$ and q,
respectively. Then every positive integer is represented by


H= | 4n-+a(2n-41) + (a—1)”—1, 


a a 
For integers = gp; —1 the theorem is true by Theorem 1. For integers 


= N, it is true by Theorem 2 since 


b= 2m4+1=a(2n+1) + (a—1)”, 


(9) (4—*) 4 [4n/a] — 2n = 3k, — 2, n = 19. 


(# ) => (27+ 1)/2a—1/2 and = 4nlogn + 2n log 6. 


*Ibid., p. 190. 
* Ibid., p. 188. 


8 


1s 
— m—1). 

(5) 
| 
and 
The latter inequality is obtained by noting that 


As a consequence of these inequalities (9) is implied by
(10) $1 \ge (96n^2\log n + 48n^2\log 6 + 8n + 16n^2)/2^n$.


For n = 17, (10) is true, and since the right-hand side decreases as n increases
it holds for $n \ge 17$.
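
Inequality (10) is easy to check numerically. The sketch below is an illustration that rests on a reading of the garbled source: the two leading terms are taken as $96n^2\log n$ and $48n^2\log 6$, and log is assumed to be the natural logarithm. Under that reading n = 17 is the first value for which the right-hand side does not exceed 1.

```python
import math


def rhs_of_10(n):
    """Right-hand side of (10), with natural logarithms (an assumption)."""
    numerator = (96 * n ** 2 * math.log(n) + 48 * n ** 2 * math.log(6)
                 + 8 * n + 16 * n ** 2)
    return numerator / 2 ** n


first_valid_n = next(n for n in range(2, 401) if rhs_of_10(n) <= 1)
decreasing_from_17 = all(rhs_of_10(n + 1) < rhs_of_10(n) for n in range(17, 400))
```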

Since $3k_1 - 2 > 0$, it is now seen that (6) is true.

It remains then to prove Theorem 3 for integers $\ge qp_3$ and $< N_1$. This
is done by ascent. From Lemma 4 and the first part of Lemma 7 all integers
in the interval $J = (qp_3, qp_3 + p_{m+1})$ are represented by [K, D], where

$K = A + (m - 2)(a - 1)$ and $D = B + \sum_{i=4}^{m+1} Q_i$, while [A, B] was initially

one of the forms (1), ..., (4) but may be replaced by one of


(a—1)", (=2)"4 


a 


Apply Lemmas 12, 13 with 9 = py, Lo = pms, both in J. The conditions, 
Lov" = ab, Li V" = b, will later be seen to hold after m is restricted. Let Z 
be the least integral value of t+ 7 for which the second member of (8) 
exceeds log;, V,, where N, is the constant in Theorem 2. It will be shown 
that [T+ K, t+ Sd. 

For the first case from (11), let 


(11) 


(12) T =4n + a(2n + 1) —d—2 — (a—1) (m— 2) — 2a. 
Then 


t+ (=2)"+ [4n/a] — 2n. 


Multiply both sides of the inequality by a, substitute aZ — aT for at, and then 
replace 7’ by (12). This gives 
aZ — 4an —a*(2n +1) + ad+ 2a-+ a(a—1)(m—2) 
+ 2a? + (2—*) — a[4n/a] — 2an. 


a 


Increase (2) ” to (r—g)/a(1+ 6) +5 and [4n/a] to 4n/a. Then 


—aZ + 4an + a?(2n + 1) — ad — 2a — 
m+1 


(13) —a(a—1)(m—2) —ab—a + 
— 4n — 2an—c/(1+ 5). 




Write = aZ — 4an — a? (2n +1) + 2a + 2a? + a(a—1)(m—2) +b 


aX +c/(1+ 6) +4n-+ 2an. In (13) decrease g to 1. Then 

(14) (r—1)/(1 +b) S—B—ad + (145-2")/(1 4). 
Similarly, for the second case from (11) let 

(15) T =4n+a(2n+ 1) —2— (a—1)(m—2) — 2a. 

Proceeding as above, one obtains 

(16) 


Let m = 2n. Then for $L_0 = p_{m+1} = 1 + b(2n)^n$, $L_0v^n > ab$. For, when
$n \ge 19$, $(1 + b(2n)^n)v^n$ is approximately equal to $(1 + b(2n)^n)/n^n > 2^nb$.
Now $2^nb > 4nb \ge ab$ when $n \ge 5$. It follows from this that $L_tV^n > b$. The
conditions in Lemmas 12 and 13 are therefore satisfied.

For T in (12) and (15) it will now be shown that? =Z—T=0. For 
this purpose Z will be determined such that log,, Li,7 > logio Ni. It suffices 
to take In (8) take v—=1/n, decrease Lp) = 1-+ b(2n)" to 
b(2n)" and increase a to 4n, thereby increasing ¢-+ 7. Then 


t+T 
(4) (n logio 2 — login 4n) > n°, 
or 


(17) > —logro logo — logie 4n) 


logio — logig — 1) 


From (12) and (15) it is seen that $T > 0$ and $< 10n$. The numerator in
(17) is $> 5\log_{10} n$. It is therefore sufficient to show that

$t + T > \dfrac{5\log_{10} n}{\log_{10} n - \log_{10}(n - 1)} > 10n.$

By computation this inequality may be shown to be satisfied for $19 \le n \le 400$.
When $19 \le n \le 400$ the two conditions (14) and (16) are satisfied.
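
The computation referred to above is easy to reproduce. The following Python sketch (the function names are illustrative and do not come from the paper) evaluates $5\log_{10} n/(\log_{10} n - \log_{10}(n-1))$ in floating point and compares it with $10n$ over the whole range. It is a numerical check only, not a proof.

```python
import math


def middle_member(n: int) -> float:
    """Return 5*log10(n) / (log10(n) - log10(n - 1))."""
    return 5 * math.log10(n) / (math.log10(n) - math.log10(n - 1))


def inequality_holds(n: int) -> bool:
    """True when the middle member exceeds 10n."""
    return middle_member(n) > 10 * n


# Collect every n in 19..400 for which the inequality would fail.
failures = [n for n in range(19, 401) if not inequality_holds(n)]
print("failures:", failures)
```

Since $\log_{10} n - \log_{10}(n-1)$ is close to $1/(n \ln 10)$, the middle member behaves like $5n \ln n$, which explains why the margin over $10n$ widens as $n$ grows.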

Increase $E$ to

(18) $E_1 = 4nZ + 22n - 1 + a\sum_{i=4}^{2n+1} Q_i + 8n^2.$

In (14) increase $r - 1$ to $r$ and $a$ to $4n$, multiply both sides by $1 + b$ and
divide by $1 + 2^n b$. Then (14) holds if


$r/(1 + 2^n b) \le 1 - (E_1 + 16n^2)(1 + b)/(1 + 2^n b).$


m+1 




Similarly, (16) holds if

$r/(1 + 2^n b) \ge (E_1 + 4n)(1 + b)/(1 + 2^n b).$


Now


$(1 + b)/(1 + 2^n b) \le 2/(1 + 2^n).$


The two inequalities above will therefore hold if


(19) $\dfrac{2E_1 + 8n}{1 + 2^n} \le \dfrac{r}{1 + 2^n b} \le 1 - \dfrac{2(E_1 + 16n^2)}{1 + 2^n}.$


The decimal part of $(1 + 3^n b)/(1 + 2^n b)$ is $r/(1 + 2^n b)$;
$(1 + 3^n b)/(1 + 2^n b) = (3/2)^n - ((3/2)^n - 1)/(1 + 2^n b).$


The latter fraction decreases as $n$ and $b$ increase. When $n = 19$ it is at most
$.004^{+}$, $n = 20$ at most $.003^{+}$. Hence the decimal part of $(1 + 3^n b)/(1 + 2^n b)$
agrees with that of $(3/2)^n$ to an increasing number of decimal places.
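
The identity and the two numerical bounds just quoted can be checked with exact rational arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it reads the garbled formula as $(1 + 3^n b)/(1 + 2^n b) = (3/2)^n - ((3/2)^n - 1)/(1 + 2^n b)$.

```python
from fractions import Fraction


def quotient(n: int, b: int) -> Fraction:
    """(1 + 3^n b) / (1 + 2^n b) as an exact rational."""
    return Fraction(1 + 3**n * b, 1 + 2**n * b)


def correction(n: int, b: int) -> Fraction:
    """The fraction ((3/2)^n - 1) / (1 + 2^n b) that is subtracted from (3/2)^n."""
    return (Fraction(3, 2) ** n - 1) / (1 + 2**n * b)


def decimal_part(x: Fraction) -> Fraction:
    """Fractional part of a positive rational."""
    return x - (x.numerator // x.denominator)


for n in (19, 20):
    print(n, float(correction(n, 1)), float(decimal_part(quotient(n, 1))))
```

With $b = 1$ the correction is largest, so the values printed for $n = 19$ and $n = 20$ are the worst cases for those exponents.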

Let $n = 19$. From (17), $Z$ may be taken to be 302. From Lemmas 3
and 14, 


2n+1 2n+1 


= [(3/2)"] + < [(8/2)"] 


$X < 634.645589$. Also, $[(3/2)^{19}] \le 2217$. Hence, $E_1 \le 29108.645589$. From
(19), $.111330 < r/(1 + 2^n b) < .866925$. Since $1 - 2(E_1 + 16n^2)/(1 + 2^n)$
increases with $n$ and $(2E_1 + 8n)/(1 + 2^n)$ decreases when $n$ increases,
Theorem 3 will be true for any $n \ge 19$ for which $r/(1 + 2^n b)$ lies between the
above limits. The condition is satisfied for $n = 19, 20$.

Let $n = 20$. Then $Z = 323$, $X < 802.08$, $[(3/2)^{20}] \le 3326$, and
$E_1 < 33607.08$. Hence, $.064253 < r/(1 + 2^n b) < .923692$. This holds for
$n = 20, 21, \cdots, 28$.

Let $n = 29$. Now $Z = 522$, $X < 7520.92208$, $[(3/2)^{29}] \le 127840$, and
$E_1 \le 203277.92208$. Hence $.000758 \le r/(1 + 2^n b) \le .9991925$. This is true
for $29 \le n \le 400$.
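
The three sets of limits can be recomputed from the quoted constants. The sketch below rests on two assumptions that should be checked against the printed original: that (19) reads $(2E_1 + 8n)/(1 + 2^n) \le r/(1 + 2^n b) \le 1 - 2(E_1 + 16n^2)/(1 + 2^n)$, and that the bound on $E_1$ is obtained from (18) as $4nZ + 22n - 1 + 8n^2$ plus the quoted bounds on $[(3/2)^n]$ and on $X$. The helper names are hypothetical.

```python
def e1_bound(n, Z, power_bound, x_bound):
    """Assumed reading of (18): the sum term is replaced by power_bound + x_bound."""
    return 4 * n * Z + 22 * n - 1 + 8 * n * n + power_bound + x_bound


def limits(n, e1):
    """Lower and upper limits for r / (1 + 2^n b) under the assumed form of (19)."""
    d = 1 + 2**n
    lower = (2 * e1 + 8 * n) / d
    upper = 1 - 2 * (e1 + 16 * n * n) / d
    return lower, upper


# n: (Z, bound on [(3/2)^n], bound on X), all three taken from the text.
CASES = {
    19: (302, 2217, 634.645589),
    20: (323, 3326, 802.08),
    29: (522, 127840, 7520.92208),
}

for n, (Z, power_bound, x_bound) in CASES.items():
    e1 = e1_bound(n, Z, power_bound, x_bound)
    lower, upper = limits(n, e1)
    print(n, round(e1, 6), round(lower, 6), round(upper, 7))
```

Under this reading the three values of $E_1$ come out as 29108.645589, 33607.08 and 203277.92208, and the limits agree with the quoted ones to about six decimal places.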


THE UNIVERSITY OF CHICAGO. 




ON THE UNITS OF INDEFINITE QUATERNION ALGEBRAS.* 


By RALPH HULL.


1. Introduction. The unit groups of maximal orders (integral sets)
of definite quaternion algebras are finite groups which are easily determined
for any given maximal order. In the indefinite case the unit groups are
infinite and very little has been known of their structure until the recent work
of Eichler,¹ whose results may be regarded as complete for those groups which
contain no element $\ne \pm 1$ of finite order. In case the group contains
non-trivial elements of finite order, Eichler determines the structure of an
invariant subgroup, of finite index, free of such elements, from which the whole
group is afterwards obtained by adjoining a suitable unit. It is the purpose
of the present paper to present a method which applies equally well to both
cases, and which has the added advantage, not possessed by Eichler's method,
of providing the generators of the groups when desired.

The method will be indicated for maximal orders associated with canonical
generations² of the algebras. It consists in associating the unit group with
the norm form of the maximal order, written as a binary Hermitian form over
an imaginary quadratic field. The units correspond to automorphs of this
form and thus give rise to a principal circle group of transformations of the
complex plane, for which, by methods due to Humbert,³ a fundamental polygon
may be constructed, and generators found. These are exhibited for some
examples in the last section.

It is found that the principal circle groups involved are of finite
character⁴ $\{h, n\}$, where $h$ is the genus of the surface associated with the
fundamental polygon, and $n$ is the number of its cycles of elliptic vertices. The
class number $n$ is evaluated by means of theorems of Kořínek⁵ on relations
between the ideals of maximal orders and those of splitting fields of the


* Received April 18, 1938.

¹ Eichler, Mathematische Annalen, vol. 114 (1937), pp. 637-654, cited hereafter
as (E).

² These are due to Albert, Bulletin of the American Mathematical Society, vol. 40
(1934), pp. 164-176, cited hereafter as (A).

³ Humbert, Comptes Rendus, vol. 169 (1919), pp. 205-211.

⁴ Fricke-Klein, Automorphe Funktionen, Chapter 3, cited hereafter as (F-K).

⁵ Kořínek, Mémoires de la Société Royale des Sciences de Bohême, No. 1, 1932.
For references to similar results of Hasse, Chevalley and E. Noether, see Deuring,
“Algebren,” Ergebnisse der Mathematik (1935), cited hereafter as (D).



algebras. Eichler's results give the non-Euclidean area of the fundamental
polygon, and this, with general formulas of Fricke and Klein, subsequently
gives the genus $h$.


2. The associated Hermitian form. Let $\mathfrak{Q}$ be a rational indefinite
quaternion algebra with fundamental number $\sigma > 1$. Then $\sigma$ is a product


(1) $\sigma = q_1 q_2 \cdots q_{2r}$


of an even number of distinct rational primes, and $\mathfrak{Q}$ is a division algebra of
discriminant $-\sigma^2$. A canonical generation of $\mathfrak{Q}$ is of the form
$\mathfrak{Q} = [1, i, j, ij]$, $i^2 = -p$, $j^2 = \sigma$, where $p$ is a prime such that


(2) (qs|p) (s=1,- ',2r), (o|p) 


We also write $\mathfrak{Q} = (\sigma, \mathfrak{Z})$, where $\mathfrak{Z}$ is the quadratic field $R(i)$, $i^2 = -p$.
The congruence


(4) $4\mu^2 \equiv \sigma \pmod p$, $4\mu^2 - \sigma = kp$,
has an integral solution $\mu$ which gives rise to a maximal order⁶
(5) M [1,0,J, Jo], o= (14 1%)/2, J = j)0/p. 


The “similar” maximal order $i^{-1}\mathfrak{M}i$ corresponds to the solution $-\mu$ of (4).
We fix $\mu$ and write the general element $\nu$ of $\mathfrak{M}$ in the form


(6) $\nu = \xi + J\eta$, $\xi = x_0 + x_1\omega$, $\eta = y_0 + y_1\omega$,


where the $x$ and $y$ are rational integers, and $\xi$, $\eta$ are integral elements of $\mathfrak{Z}$.
The multiplication table of the basis (5) is readily computed and, in
particular, we have the matrix representation of $\mathfrak{Q}$ corresponding to this basis:


| 


where $\xi'$ denotes the conjugate of $\xi$ in $\mathfrak{Z}$. The norm $N(\nu)$ is the determinant
of the matrix (7) which may be written as the Hermitian form


(8) $N(\nu) = N(\xi + J\eta) = \zeta\zeta' - \sigma\eta\eta'/p$, $\zeta = \xi + 2\mu i\eta/p$,


and represents only rational integers for $\xi$, $\eta$ integral in $\mathfrak{Z}$. The units of $\mathfrak{M}$
are its elements of norm $\pm 1$. These evidently form a group which will be
denoted by $\mathfrak{H}$.


⁶ We use a slight modification of the form in (A).




Eichler has shown⁷ that $\mathfrak{H}$ contains elements of norm $-1$ for every $\sigma$,
and it is clear that the elements of norm $+1$ form an invariant subgroup $\mathfrak{H}_1$,
of index 2 in $\mathfrak{H}$, and such that


(9) $\mathfrak{H} = \{u_1, \mathfrak{H}_1\}$,
where $u_1$ is an arbitrary unit of norm $-1$. Henceforth we restrict attention
to $\mathfrak{H}_1$.


Let $u$ be an arbitrary unit in $\mathfrak{H}_1$:
(10) $u = \alpha + J\beta$, $N(u) = 1$, $\alpha$ and $\beta$ integral in $\mathfrak{Z}$.


It follows readily from (7) that $u\nu = (\alpha + J\beta)(\xi + J\eta) = \xi_1 + J\eta_1$, where,
on writing in accord with (8):
(11) $\gamma = \alpha + 2\mu i\beta/p$, $\zeta = \xi + 2\mu i\eta/p$,
we have
(12) $\zeta_1 = \gamma\zeta + \sigma\beta'\eta/p$,
     $\eta_1 = \beta\zeta + \gamma'\eta$, $\gamma\gamma' - \sigma\beta\beta'/p = 1$.
The form (8) is evidently invariant under (12). Although neither is integral
in $\mathfrak{Z}$ we shall call the latter an automorph of the former and have


THEOREM 1. The group $\mathfrak{H}_1$ can be represented as a group of automorphs
(12) of the Hermitian form (8).
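
The invariance asserted before Theorem 1 is a purely algebraic identity and can be spot-checked numerically. The sketch below assumes that the form (8) is $\zeta\zeta' - \sigma\eta\eta'/p$ and that the substitution (12) reads $\zeta_1 = \gamma\zeta + \sigma\beta'\eta/p$, $\eta_1 = \beta\zeta + \gamma'\eta$; elements of the imaginary quadratic field are modelled as Python complex numbers with the prime as complex conjugation. The helper `gamma_for` is hypothetical and simply manufactures a $\gamma$ satisfying $\gamma\gamma' - \sigma\beta\beta'/p = 1$ for a given $\beta$; it does not produce integral units.

```python
import cmath
import math


def hermitian_form(zeta, eta, sigma, p):
    """zeta*zeta' - sigma*eta*eta'/p, with ' read as complex conjugation."""
    return abs(zeta) ** 2 - sigma * abs(eta) ** 2 / p


def automorph(zeta, eta, gamma, beta, sigma, p):
    """Apply the substitution (12) in the reading stated above."""
    zeta1 = gamma * zeta + sigma * beta.conjugate() * eta / p
    eta1 = beta * zeta + gamma.conjugate() * eta
    return zeta1, eta1


def gamma_for(beta, angle, sigma, p):
    """Pick gamma with gamma*gamma' - sigma*beta*beta'/p = 1 (hypothetical helper)."""
    return cmath.rect(math.sqrt(1 + sigma * abs(beta) ** 2 / p), angle)


sigma, p = 6, 19
beta = 1.5 - 0.5j
gamma = gamma_for(beta, 0.7, sigma, p)
zeta, eta = 2 + 3j, -1 + 0.25j
zeta1, eta1 = automorph(zeta, eta, gamma, beta, sigma, p)
print(hermitian_form(zeta, eta, sigma, p), hermitian_form(zeta1, eta1, sigma, p))
```

The cross terms cancel identically, so the check does not depend on $\gamma$ and $\beta$ coming from an actual unit, only on the determinant condition.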


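3. The principal circle group. We now consider the transformation


(13) $z_1 = T(z; u) = \dfrac{\gamma z + \sigma\beta'/p}{\beta z + \gamma'}$, $\gamma\gamma' - \sigma\beta\beta'/p = 1$,

of the complex $z$-plane, corresponding to the unit (10). Direct computation
gives
(14) $z_1 z_1' - \sigma/p = (zz' - \sigma/p)/|\beta z + \gamma'|^2$,
where $|\beta z + \gamma'|$ denotes the absolute value. We see that $|\beta z + \gamma'| \ne 0$ for
$z$ interior to or on the circle
(15) $\mathfrak{C}$: $zz' = \sigma/p$.


For, if $\beta = 0$, $|\beta z + \gamma'| = |\gamma'| > 0$, and if $\beta \ne 0$, $|\beta z + \gamma'| = 0$ implies
$z = -\gamma'/\beta$, $zz' = (p + \sigma\beta\beta')/p\beta\beta' > \sigma/p$ by (12). Hence $T(z; u)$
transforms $\mathfrak{C}$ into itself, its interior into its interior.⁸


⁷ (E), p. 644.
⁸ Units of norm $-1$ carry an interior into an exterior point.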
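
Relation (14) is what makes $\mathfrak{C}$ a principal circle, and it can be tested directly. The following sketch assumes (13) in the form $T(z; u) = (\gamma z + \sigma\beta'/p)/(\beta z + \gamma')$ with $\gamma\gamma' - \sigma\beta\beta'/p = 1$, again modelling the conjugate by complex conjugation. The sample values of $\beta$, $\gamma$ and $z$ are arbitrary illustrations, not units of the order.

```python
import cmath
import math


def T(z, gamma, beta, sigma, p):
    """The linear fractional substitution (13), in the assumed reading."""
    return (gamma * z + sigma * beta.conjugate() / p) / (beta * z + gamma.conjugate())


def defect(z, sigma, p):
    """z z' - sigma/p: negative inside the circle, zero on it."""
    return abs(z) ** 2 - sigma / p


sigma, p = 6, 19
beta = 0.8 + 1.1j
gamma = cmath.rect(math.sqrt(1 + sigma * abs(beta) ** 2 / p), -0.4)

inside = 0.3 + 0.2j                               # |z|^2 = 0.13 < 6/19
on_circle = cmath.rect(math.sqrt(sigma / p), 1.1)  # a point of the circle itself

for z in (inside, on_circle):
    lhs = defect(T(z, gamma, beta, sigma, p), sigma, p)
    rhs = defect(z, sigma, p) / abs(beta * z + gamma.conjugate()) ** 2
    print(lhs, rhs)
```

Because the denominator in (14) is positive, the sign of $zz' - \sigma/p$ is preserved, which is exactly the statement that interior points go to interior points.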




We denote by $\mathfrak{U}$ the totality of transformations (13) corresponding to
elements of $\mathfrak{H}_1$. Then $\mathfrak{U}$ is a group and we shall prove


THEOREM 2. The group $\mathfrak{H}_1$ is homomorphic, in a two-to-one
correspondence, with the group $\mathfrak{U}$, which is a Fuchsian group having the principal
circle (15).

The first part of the theorem is obvious since $T(z; u) = T(z; -u)$ and
$N(-u) = N(u)$, and $N(u) = 1$ implies that $\alpha$ and $\beta$ have g.c.d. 1 in $\mathfrak{Z}$.
To prove the last part we have only to show⁹ that $\mathfrak{U}$ is properly discontinuous
on the interior of $\mathfrak{C}$. At the same time we give formulas which are useful
for finding units.

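We write

$\alpha = a_0 + a_1\omega$, $\beta = b_0 + b_1\omega$,
$X = -i(\gamma - \gamma') = pa_1 + 2\mu(2b_0 + b_1)$,
(16) $Y = \beta + \beta' = 2b_0 + b_1$, $Z = b_1$,
$W = \gamma + \gamma' = 2a_0 + a_1 - 2\mu b_1$.


Then $X, \cdots, W$ are rational integers such that
$X^2 + pW^2 - \sigma(Y^2 + pZ^2) = 4p$,


(17) $X \equiv 2\mu Y \pmod p$, $X \equiv W$, $Y \equiv Z \pmod 2$,
$4N(\beta) = Y^2 + pZ^2$, $4pN(\gamma) = X^2 + pW^2$.


It is evident from (17) that at most a finite number of units correspond
to a given $\beta$. We have also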


(18) | Be +7 | 
|) 
> | (1 — + | |). 


For $zz' < \sigma/p$, it follows from (17) and (18) that $|\beta z + \gamma'| \to \infty$ as
$|\beta| \to \infty$ since the last fraction in (18) has a finite limit $\ne 0$. Hence, by
(14), the transforms of an arbitrary point interior to $\mathfrak{C}$ approach the
circumference of $\mathfrak{C}$ as $N(\beta) \to \infty$. It follows that a cluster point of such
transforms is necessarily on the circumference¹⁰ and $\mathfrak{U}$ is properly discontinuous
on the interior of $\mathfrak{C}$. Hence $\mathfrak{U}$ is a Fuchsian group with principal circle $\mathfrak{C}$
by the definition of such groups.


⁹ Our proof is essentially that of Picard, Annales Scientifiques de l'École Normale
Supérieure, vol. 20 (1884), p. 50.

¹⁰ That, conversely, every point of the circumference is a cluster point of
transforms of an interior point follows indirectly from later results, no. 5.




4. The fundamental polygon. We wish to apply Humbert's practical
formulation of the method of “rayonnement” to find a fundamental domain
for $\mathfrak{U}$ interior to $\mathfrak{C}$. For this purpose, it is necessary to assume that $z = \infty$
is not a fixed point of a substitution $T(z; u)$, $u \ne \pm 1$. This is evidently
equivalent to the assumption


(19) $u \ne \pm 1$ implies $\beta \ne 0$.


If $p = 3$, we have the unit $u = \alpha = (-1 + i)/2$, $\beta = 0$. If $p > 3$, $\alpha\alpha' = 1$
has the only integral solutions $\alpha = \pm 1$. We assume¹¹ henceforth that $p > 3$.
Then we have (19) and¹²


THEOREM 3. If $p > 3$, the group $\mathfrak{U}$ has a fundamental domain, for the
interior of $\mathfrak{C}$, consisting of that part of the interior of $\mathfrak{C}$ which is exterior to
all the circles
(20) $|\beta z + \gamma'| = 1$,


as $u$ ranges over $\mathfrak{H}_1$.


The circles (20) are orthogonal to $\mathfrak{C}$ and are called¹³ the isometric circles
of the substitutions of $\mathfrak{U}$. The boundary of the fundamental domain is a
polygon whose sides are arcs of certain of the circles (20). We denote this
fundamental polygon by $\mathfrak{P}$.
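
The orthogonality of the isometric circles to the principal circle follows from the determinant condition alone, as a short computation shows. The sketch below assumes (20) is the circle $|\beta z + \gamma'| = 1$, with center $-\gamma'/\beta$ and radius $1/|\beta|$, and uses the elementary criterion that two circles are orthogonal when the squared distance between centers equals the sum of the squared radii. The function names and sample values are illustrative only.

```python
import cmath
import math


def isometric_circle(gamma, beta):
    """Center and radius of |beta z + gamma'| = 1 (requires beta != 0)."""
    return -gamma.conjugate() / beta, 1 / abs(beta)


def orthogonal_to_principal_circle(center, radius, sigma, p, tol=1e-9):
    """Circles meet at right angles when d^2 = R^2 + r^2, here with R^2 = sigma/p."""
    return abs(abs(center) ** 2 - (sigma / p + radius ** 2)) < tol


sigma, p = 10, 43
beta = 2 - 0.5j
# gamma chosen so that gamma*gamma' - sigma*beta*beta'/p = 1
gamma = cmath.rect(math.sqrt(1 + sigma * abs(beta) ** 2 / p), 2.0)
center, radius = isometric_circle(gamma, beta)
print(center, radius, orthogonal_to_principal_circle(center, radius, sigma, p))
```

Indeed $|\gamma|^2/|\beta|^2 = \sigma/p + 1/|\beta|^2$ is just the determinant condition divided by $\beta\beta'$, so every isometric circle has its center outside $\mathfrak{C}$.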


5. Properties of the fundamental polygon. We require certain formulas,
related to the polygon $\mathfrak{P}$, which are obtained by combining the results of
Eichler¹⁴ with well-known formulas from the theory of Fuchsian groups and
the associated hyperbolic geometry.

The interior of $\mathfrak{C}$, with its orthogonal circles as “straight lines,” is the
hyperbolic plane of the geometry of Lobatchefski. For this geometry, $\mathfrak{U}$ is a
group of displacements preserving the projective metric. Eichler, in a
somewhat less explicit way than that employed in the present paper, represented $\mathfrak{H}_1$
as a group of non-Euclidean displacements of the interior of the unit circle
$\mathfrak{C}_1$. He found, by analytic methods, the formula


(21) $\mathfrak{A} = \pi\phi(\sigma)/3$,


¹¹ This is a restriction only when $R(i)$, $i^2 = -3$, splits $\mathfrak{Q}$. For any $\sigma$, there exist
infinitely many $p$ giving canonical generations.

¹² Humbert, loc. cit., 3).

¹³ Ford, Automorphic Functions, p. 23.

¹⁴ (E), p. 649. Cf. Humbert, Comptes Rendus, vol. 171 (1920), pp. 377-383, for
somewhat similar results obtained by the same methods directly for Hermitian forms.
For the general theory of the polygons associated with Fuchsian groups, see (F-K),
in particular p. 262.




where $\phi$ is the Euler function, for the non-Euclidean area $\mathfrak{A}$ of a fundamental
domain interior to $\mathfrak{C}_1$. We obtain $\mathfrak{C}_1$ from $\mathfrak{C}$ by the linear transformation


$z \to (p/\sigma)^{1/2} z$,


which preserves the projective metric. Moreover, the non-Euclidean area of a
fundamental domain for $\mathfrak{U}$ interior to $\mathfrak{C}$ is independent of the particular choice
of the domain since $\mathfrak{U}$ also preserves the metric. It follows that (21) gives
the area $\mathfrak{A}$ of the polygon $\mathfrak{P}$.

On the other hand, the projective area of a polygon interior to $\mathfrak{C}$ is given
by the formula
(22) $\mathfrak{A} = (m - 2)\pi - \Sigma$,


where $m$ is the number of sides of the polygon (arcs of circles orthogonal to
$\mathfrak{C}$), and $\Sigma$ is the sum of its angles (measured in the elementary sense). We
combine (21) and (22) to obtain


THEOREM 4. Let $m$ be the number of sides, $\Sigma$ the sum of the angles,
of $\mathfrak{P}$. Then
(23) $\pi\phi(\sigma)/3 = (m - 2)\pi - \Sigma$.


The number of sides of $\mathfrak{P}$ is even: $m = 2t$, with the understanding that
two sides, forming the angle $\pi$, may lie along the same arc in case a
“non-apparent” vertex of $\mathfrak{P}$ occurs. This corresponds to a fixed point of an elliptic
substitution of order 2. We proceed to a further study of the vertices of $\mathfrak{P}$.

The finiteness of $\mathfrak{A}$¹⁵ implies that $\mathfrak{P}$ has at most parabolic vertices on $\mathfrak{C}$,
and we now show that such do not actually occur. For, a parabolic vertex of
$\mathfrak{P}$ is the fixed point of a substitution $T(z; u)$, $u \ne \pm 1$, with $\gamma + \gamma' = \pm 2$. By
(13) such a fixed point is of the form $z = (\gamma - \gamma')/2\beta$, and $zz' = \sigma/p$ implies
$N(\gamma - \gamma' + 2J\beta) = 0$. This is impossible since $\mathfrak{Q}$ is a division algebra. It
follows that the cycles of vertices of $\mathfrak{P}$ are of at most two types: (1) adventive
(zufällig) vertices, for which the sum of the angles is $2\pi = 2\pi/k$, $k = 1$;
(2) elliptic vertices, for which the sum of the angles of a cycle is $2\pi/k$, $k > 1$,
where $k$ is the order of the elliptic substitutions of the corresponding class.
In the next section it will be seen that type (2) may or may not occur and
that, for this type, $k$ can have only the values 2 or 3. We denote by $m_1$, $m_2$, $m_3$
the number of cycles of vertices of $\mathfrak{P}$ for which $k = 1, 2, 3$, respectively.
Evaluating $\Sigma$, and inserting the result in (23), we obtain


(24) $\phi(\sigma) = 6(t - m_1) - 6 - 3m_2 - 2m_3$.


¹⁵ This property of $\mathfrak{A}$ is also essential in Eichler's proof of (21).




By the usual identification of equivalent points on the sides of $\mathfrak{P}$ a
surface is obtained whose genus $h$ is given by the formula¹⁶


(25) $2h - 1 = t - m_1 - m_2 - m_3$.
We combine (24) and (25) to obtain (26) below and have


THEOREM 5. The polygon $\mathfrak{P}$ is associated with a surface of genus $h$
given by
(26) $12(h - 1) = \phi(\sigma) - 3m_2 - 4m_3$,


where $m_2$ and $m_3$ are the number of cycles of elliptic vertices of $\mathfrak{P}$
corresponding to classes of elliptic substitutions of orders 2 and 3 respectively.


6. The classes of elliptic substitutions. We recall that two elliptic
substitutions of a Fuchsian group are said to be in the same class if one is the
transform of the other by a substitution of the group. For the group $\mathfrak{U}$, the
numbers $m_2$ and $m_3$, defined in Theorem 5, have the values given in


THEOREM 6. The group $\mathfrak{U}$ has elliptic substitutions at most of orders
$k = 2$ and $k = 3$. The case $k = 2$ occurs, that is $m_2 > 0$, if and only if every
odd prime divisor of $\sigma$ is $\equiv 3 \pmod 4$ and then $m_2 = 2^{2r}$ or $m_2 = 2^{2r-1}$,
according as $2 \nmid \sigma$ or $2 \mid \sigma$. The case $k = 3$ occurs, that is $m_3 > 0$, if and only
if every prime divisor $\ne 3$ of $\sigma$ is $\equiv 2 \pmod 3$ and then $m_3 = 2^{2r}$ or $m_3 = 2^{2r-1}$,
according as $3 \nmid \sigma$ or $3 \mid \sigma$. In both cases, $r$ is defined in (1).
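
Theorems 5 and 6 together determine the character $\{h, n\}$ from $\sigma$ alone, and the rule is simple enough to tabulate. The sketch below assumes that (26) reads $12(h - 1) = \phi(\sigma) - 3m_2 - 4m_3$, that the counts in Theorem 6 are $2^{2r}$ or $2^{2r-1}$, and that $n = m_2 + m_3$; the function names are hypothetical. The input is expected to be a product of an even number of distinct primes, as in (1).

```python
def distinct_primes(n):
    """Distinct prime divisors of n by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def euler_phi(n):
    """Euler's function."""
    result = n
    for q in distinct_primes(n):
        result -= result // q
    return result


def elliptic_cycles(sigma):
    """(m2, m3) according to the assumed reading of Theorem 6."""
    primes = distinct_primes(sigma)
    two_r = len(primes)  # the number 2r of prime factors in (1)
    m2 = m3 = 0
    if all(q % 4 == 3 for q in primes if q != 2):
        m2 = 2 ** two_r if sigma % 2 else 2 ** (two_r - 1)
    if all(q % 3 == 2 for q in primes if q != 3):
        m3 = 2 ** two_r if sigma % 3 else 2 ** (two_r - 1)
    return m2, m3


def character(sigma):
    """Return (h, n) with 12(h - 1) = phi(sigma) - 3*m2 - 4*m3 and n = m2 + m3."""
    m2, m3 = elliptic_cycles(sigma)
    twelve_h_minus_1 = euler_phi(sigma) - 3 * m2 - 4 * m3
    if twelve_h_minus_1 % 12:
        raise ValueError("sigma is not a valid fundamental number")
    return twelve_h_minus_1 // 12 + 1, m2 + m3


for sigma in (6, 10, 14, 15, 26):
    print(sigma, character(sigma))
```

For $\sigma = 6$, 10 and 26 this reproduces the characters $\{0, 4\}$, $\{0, 4\}$ and $\{2, 0\}$ stated in the examples of the last section.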


An elliptic substitution $T(z; u)$ of $\mathfrak{U}$ corresponds to a unit $u \ne \pm 1$ of
$\mathfrak{H}_1$ which is of finite order and hence a root of unity. Since $\mathfrak{Q}$ is of degree 2
over $R$, $\mathfrak{Q}$ can contain at most fourth or sixth roots of unity. This proves the
first sentence of the theorem since $T(z; u) = T(z; -u)$, and in case $k = 3$,
one of the units $\pm u$ is a cube root of unity. In view of the last remark it
will be sufficient to give details only for the case $k = 2$ since those for $k = 3$
are exactly similar.

By known theorems¹⁷ on the splitting fields of algebras, a quadratic field
$\mathfrak{Z}$ splits $\mathfrak{Q}$ if and only if no prime divisor of $\sigma$ is the product in $\mathfrak{Z}$ of two
distinct prime ideals. This gives at once the criterion of the theorem for the
case $k = 2$ (and similarly for $k = 3$) by the known laws of factorization in
$\mathfrak{Z} = R(\theta)$, $\theta^2 = -1$. We assume henceforth that the criterion is fulfilled
and proceed to evaluate $m_2$.


¹⁶ (F-K), p. 262, formula (2).

¹⁷ General theorems are due to Hasse. Cf. (D), pp. 117-118. For the present
special case of quaternion algebras elementary proofs are given by Latimer, Duke
Mathematical Journal, vol. 2 (1936), pp. 681-684.




First, let w and wu’ be any two units of $, such that 7'(z; uw) and T(z; w) 
are elliptic of order 2. Then %, = R(u) and %2— R(u’) are quadratic sub- 
fields of isomorphic under the correspondence u< u’. Hence there exists 
a regular element v of Q such that vtuv =u’. Write Yt, = vPtv-, and let m 
denote the unique maximal order of §,. Without loss of generality, we may 
assume that v is in Yt. Then v¥Yt is an Mt-right-ideal whose left order MN, 
contains m= vmt'v', where m’ is the maximal order of %2. Hence the 
distance ideal D = Mt,)-? where a is an m-ideal. We have also 
M, = DMD*. Since L(G), 6 ——1, has class number 1, a is a principal 
ideal: a= am, « in m, and Yt, = aMta-*. A rational number a can be chosen 
so that aa*v—v, is a primitive*® element in Mt. Then v,uv, =v, 
01° Mo = vtaMatv = v Mv = M, and v, Mt = Mv, — J is a primitive two- 
sided ideal of M%. This proves the 


Lemma. Let u and w’ be any two imaginary fourth roots of unity in N. 
Then there exists a two-sided primitive integral ideal 3 of Mt such that ¥ isa 
principal ideal and 
(27) == UM == Mev, v = 


In view of the Lemma, to complete the proof of Theorem 6 we have to
count the two-sided primitive ideals $\mathfrak{J}$ of $\mathfrak{M}$, and then determine when the
substitutions $T(z; u)$ and $T(z; u')$, corresponding to $u$ and $u'$ in (27), are
in the same class.

The primitive two-sided ideals 7! of Mt are the 27” products 


(28) == * Jor", = 0, 1 (s == 1,- 


where Miqs = qsMt = qs, Is? = qs, for the factors (1) of «. Since ”? the class 
number of © is 1, each Yi: (28) is principal: Y(e) = v(e) Mt Mtv (e). 
Let 3 = vM be a fixed ideal (28) and suppose that uw) and wv’), 
u’ =v" uv, are in the same class. Then there exists a unit é of , such that 
wu Since év-'uvé! u, and R(u) is a maximal sub-field 
of ©, it follows that is in R(u). But is integral and N(vé") 
= N(v)N(é"*) = N(v) = der, from which it follows that at most 
the exponent « corresponding to q = 2 can be ~ 0 since the remaining q’s are 
indecomposable in R(w). Thus, if 2¢0, N(v) =1, v is a unit, and m, =. 
If the right-ideal 3, = (1 + u)M, 2, is necessarily two-sided 


¹⁸ (D), p. 42, Theorem 3.

¹⁹ Kořínek, loc. cit., 5), p. 9, Theorem 3.

²⁰ That is, not divisible by a rational integer $> 1$.

²¹ See (D), p. 88, Theorem 2.

²² Eichler, Journal für Mathematik, vol. 176 (1937), p. 192.




since 2 has only two-sided ideal factors in Mt, and hence (1+ u) Mt = vM, 
v=(1+4u)y, =v (1+ u)u(1+ u)y = where is a unit 
of $\mathfrak{H}_1$. In this case evidently $m_2 = 2^{2r-1}$. This completes the proof of Theorem
6 for $k = 2$.


7. The structure of the unit groups. We have seen that $\mathfrak{H}$ is of the
form (9) and
(29) $\mathfrak{H}_1 = \{-1, \mathfrak{U}\}$,
where $\mathfrak{U}$ is a Fuchsian group of finite character $\{h, n\}$, $n = m_2 + m_3$,
$12(h - 1) = \phi(\sigma) - 3m_2 - 4m_3$, and $m_2$ and $m_3$ are given by Theorem 6.
An immediate consequence of the general theory²³ of Fuchsian groups is
THEOREM 7. The group $\mathfrak{U}$ has a canonical set of generators:
(30) U = Vis? ORs V's, vn}, 
which satisfy the only essential relations 


n 


h 
(31) IT we 05720’ = 1, = 1, k = 2 or 3. 
j=1 


The group $\mathfrak{H}$ is of the form (9), (29), (30) and (31).


In case $m_2 = m_3 = n = 0$, that is, by Theorem 6, if at least one prime
divisor of $\sigma$ is $\equiv 1 \pmod 4$ and at least one $\equiv 1 \pmod 3$, the generators $u$
in (30), (31) are lacking and the form of Theorem 7 coincides with that
given by Eichler.²⁴ In case $n \ne 0$, that is at least one of $m_2$ and $m_3 \ne 0$,
Eichler studied a congruentially defined sub-group $\mathfrak{H}_q$ of $\mathfrak{H}_1$: $u \equiv 1 \pmod{\mathfrak{q}}$,
where $\mathfrak{q}$ is a suitable $\mathfrak{q}$ in (28). The corresponding $\mathfrak{U}_q$ is of character $\{h_q, 0\}$,
$24(h_q - 1) = \phi(\sigma)(q + 1)$, $q = N(\mathfrak{q})$, and


= {y, Ug}, in Ug, = Ua; 


where $\mathfrak{U}_q$ has generators $v$, $v'$ as in (30) and (31), with $h$ replaced by $h_q$.

In general, the polygon $\mathfrak{P}$ of no. 4 does not give the generators (30)
immediately. They may subsequently be found by the usual transformations from
the generators associated with $\mathfrak{P}$ and the relations satisfied by them.


8. Examples. For a few small values of $\sigma$ we give some results of actual
computation of generators of the groups $\mathfrak{U}$. These are determined for canonical
maximal orders $\mathfrak{M}$, but the unit groups of all maximal orders of a given $\mathfrak{Q}$
are isomorphic since $\mathfrak{Q}$ has class number 1.




²³ (F-K), pp. 186-187.
²⁴ (E), p. 650.




The units are obtained from solutions $(X, Y, Z, W)$ of (16) and (17)
and it will be convenient to speak of the unit $(X, Y, Z, W)$ and of the
(isometric) circle $(X, Y, Z, W)$. We note that the units $(X, Y, Z, W)$ and
$(X, Y, -Z, -W)$ are inverses and one carries its own circle into that of the
other. Also, the circles $(X, Y, Z, W)$ and $(X, Y, -Z, W)$ are symmetric
with respect to the axis of imaginaries. The cases $k = 2$ and $k = 3$ of Theorem
6 correspond to $W = 0$ and $W = \pm 1$, respectively. By analogy with the Pell
equation, it is to be expected that the sides of $\mathfrak{P}$ will correspond to small values
of $4N(\beta) = Y^2 + pZ^2$. However, certain units may be found, before $\mathfrak{P}$ is
closed, which do not contribute to the boundary of $\mathfrak{P}$. Such units are
contained in the subgroups generated by units found earlier. In some cases, e.g.
$\sigma = 10$, this increases the computations. The examples follow.

I. $\sigma = 6$, $p = 19$, $\mu = -7$, $\{h, n\} = \{0, 4\}$.

$\mathfrak{P}$ is bounded by the five circles $(10, 2, 0, 0)$, $(1, 4, 0, \pm 3)$ and $(-14, 1, \pm 1, 0)$.
The first contributes two sides since the point at which it crosses the axis of
imaginaries is a fixed point of the corresponding $T(z; u)$. In this case, $\mathfrak{U}$ can
be generated by three of its substitutions (and their inverses). Canonical
generators $U_1, \cdots, U_4$ are $(10, 2, 0, 0)$, $(15, 3, 1, -1)$, $(14, -1, 1, 0)$ and
$(-9, 2, 0, 1)$, respectively.

II. $\sigma = 10$, $p = 43$, $\{h, n\} = \{0, 4\}$, $\mu = -14$.


For $\sigma = 10$, $p = 3$ gives a canonical generation of $\mathfrak{Q}$ but is excluded by no. 4.
$\mathfrak{P}$ is bounded by $(-13, 2, 0, \pm 1)$, $(17, 4, 0, \pm 1)$ and $(8, 12, 0, \pm 6)$.
Canonical generators $U_1, \cdots, U_4$ are $(-13, 2, 0, 1)$, $(17, 4, 0, -1)$, $(47, 6, 2, -1)$
and $(43, 0, 2, -1)$, respectively.
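
The units listed in examples I and II can be checked mechanically against the conditions (17). The sketch below assumes that (17) consists of the norm equation $X^2 + pW^2 - \sigma(Y^2 + pZ^2) = 4p$ together with the congruences $X \equiv 2\mu Y \pmod p$, $X \equiv W$ and $Y \equiv Z \pmod 2$. The value $\mu = -14$ used for example II is an assumption: it is a solution of $4\mu^2 \equiv 10 \pmod{43}$ and is the sign consistent with every unit listed there. The function name is hypothetical.

```python
def satisfies_17(unit, sigma, p, mu):
    """Check the norm equation and the three congruences of (17) for (X, Y, Z, W)."""
    X, Y, Z, W = unit
    return (
        X * X + p * W * W - sigma * (Y * Y + p * Z * Z) == 4 * p
        and (X - 2 * mu * Y) % p == 0
        and (X - W) % 2 == 0
        and (Y - Z) % 2 == 0
    )


# (sigma, p, mu): units quoted in the text, with both signs where "+/-" is printed.
EXAMPLES = {
    (6, 19, -7): [
        (10, 2, 0, 0), (1, 4, 0, 3), (1, 4, 0, -3), (-14, 1, 1, 0), (-14, 1, -1, 0),
        (15, 3, 1, -1), (14, -1, 1, 0), (-9, 2, 0, 1),
    ],
    (10, 43, -14): [
        (-13, 2, 0, 1), (-13, 2, 0, -1), (17, 4, 0, 1), (17, 4, 0, -1),
        (8, 12, 0, 6), (8, 12, 0, -6), (47, 6, 2, -1), (43, 0, 2, -1),
    ],
}

for (sigma, p, mu), units in EXAMPLES.items():
    assert (4 * mu * mu - sigma) % p == 0  # mu solves the congruence (4)
    print(sigma, p, [satisfies_17(u, sigma, p, mu) for u in units])
```

A check of this kind is also a convenient filter when searching for further units by running over small values of $Y^2 + pZ^2$.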


III. In each of the cases $\sigma = 14$ $(p = 11)$ and $\sigma = 15$ $(p = 7)$, $\mathfrak{P}$ has ten
sides, with a non-apparent vertex in case $\sigma = 14$.

IV. $\sigma = 26$, $p = 11$, $\mu = 1$, $\{h, n\} = \{2, 0\}$, $m_2 = m_3 = 0$.


This is the smallest case of $\sigma$ for which $n = 0$. $\mathfrak{P}$ has fourteen sides and its
vertices are grouped in four cycles, two of three and two of four vertices each.

The writer has been unable to determine generally the number of sides
of $\mathfrak{P}$. This, with the number $m_1$ of non-elliptic cycles of vertices, was
eliminated from (24) and (25). Inasmuch as $\mathfrak{P}$ is a special domaine rayonné,
with its “center” coinciding with the center of $\mathfrak{C}$, it is possible that, with a
different choice of “center,” a polygon more suitable for finding canonical
generators (30) and (31) could be found.


THE UNIVERSITY OF ILLINOIS.



NON-ABELIAN COMPACT CONNECTED TRANSFORMATION 
GROUPS OF THREE-SPACE.* 


By DEANE MONTGOMERY and LEO ZIPPIN.


1. In this paper we conclude our study of compact connected effective
transformation groups of Euclidean three-space (hereafter, $Tg(E_3)$) with a
final theorem on the non-abelian case. We have already established and shall
use the result that:


1') The only abelian $Tg(E_3)$ is, abstractly, the group of rotations of a circle.¹
Moreover, if $K$ denotes such a group, the space $E_3$ must admit a coördinate
system in which $K$ is the group of all rigid rotations of $E_3$ about a fixed axis.²


This will now be complemented by the following theorem: 


1'') The only non-abelian $Tg(E_3)$ is, abstractly, the group of rotations of
the two-sphere.³ Moreover, if $G'$ denotes such a group, the space $E_3$ must
admit a coördinate system in which $G'$ is the group of all rigid rotations of
$E_3$ about the origin.


The plan underlying our proof is to show that a group $G'$, as in 1'',
contains at least two distinct groups as in 1' and that there exists a single
coördinate system in $E_3$ which is appropriate to these two groups. Interlocked
with this argument there is a topological analysis of the orbits of points under
the group generated by these two, by which we achieve a reduction of our
problem from three- to two-space.


1.1 We remind the reader that a topological group $G$ is called a
transformation group of a topological space $R$ provided that with each element $g$ of $G$
and point $x$ of $R$ there is associated a unique point $g(x)$ of $R$ and:


i) the association $g(x)$ is simultaneously continuous in $g$ and $x$,


ii) when $g_0$ is the identity of $G$, $g_0(x) = x$,
iii) when $g_3 = g_1 g_2$, $g_1[g_2(x)] = g_3(x)$.


* Received June 18, 1938.

¹ We shall call this the circle-group: it may also be realized as the additive group
of real numbers modulo one.

² In cylindrical coördinates the fixed axis will be the line $r = 0$, and the elements
of $K$ will be the transformations: $(z, r, \theta) \to (z, r, \theta + 2\pi t)$ $(0 \le t < 1)$. See (4)
and (5) of the bibliography at the end of the paper.

³ The structure of this group is well-known.




The transformation group is said to be effective if
$g(x) = x$ for all $x$
is characteristic of the identity element only.⁴


1.2 We recall the obvious consequences that each element of $G$ is a
homeomorphic mapping of $R$ onto all of $R$ and that for each $x$ the “orbit” $G(x)$,⁵
the set of all “transforms” of $x$ under $G$, is homeomorphic to a coset-space
of $G$ (depending, in general, on $x$). Accordingly, if $G$ is compact and
connected, so is $G(x)$. Always, $G(x)$ is a “strongly-homogeneous” subset of $R$.
This means, by definition, that if $y$ and $z$ denote two points of $G(x)$ there
exists at least one homeomorphic mapping of $R$ onto itself under which $G(x)$
is invariant and $y$ is carried to $z$.

We remark that if $G$ is an effective transformation group of $R$, all
subgroups of $G$ are likewise effective.
2. We now prove a general group-theoretic lemma. 


LEMMA 1. Every compact connected non-abelian group $G'$ contains at
least two distinct compact connected abelian subgroups $K_1$ and $K_2$.


It is known (7) that such a group $G'$ must contain (arbitrarily small
compact abelian invariant) subgroups $G^*$ for which $G'/G^*$ is a non-abelian Lie
group. The Lie group $G'/G^*$ must (see e.g. (2)) contain a one-parameter
subgroup. The closure of this we will denote by $\bar{F}$. The group $\bar{F}$ is an abelian
Lie group and therefore a toral group (6, 7). Then $\bar{F}$ contains an element
$\bar{f}$ whose powers are everywhere dense in $\bar{F}$. Let $f$ be an element of $G'$ in the
“coset” $\bar{f}$. The powers of $f$ form a group whose closure $F$ is abelian. The
factor group of $F$ by its intersection with $G^*$ must “cover” $\bar{F}$. Then it is
clear that $F$ cannot be totally disconnected and must contain a connected
subgroup $K_1$. Now $G'/G^*$ cannot coincide with $\bar{F}$ (which is abelian) and must
therefore contain a one parameter group with elements not in $\bar{F}$ (2). From
this second group we obtain, quite as above, an abelian group $K_2$ which must
contain elements not in $K_1$.


3. For the remainder of this paper $G'$ will denote a fixed non-abelian
$Tg(E_3)$ and $E$ will denote the euclidean three-space (heretofore $E_3$). The
subgroups $K_1$ and $K_2$, as in the previous lemma, are themselves abelian $Tg(E)$

⁴ Each transformation group $G$ carries with it, in a natural way, an effective factor
group. See (5).

⁵ In general, if $K$ denotes a subgroup of $G$ and $S$ a subset of $R$, $K(S)$ will denote
the set of all points $k(s)$ for $k$ of $K$ and $s$ in $S$.




and therefore circle-groups. From the fact that they are distinct subgroups
of $G'$ we shall show that there is at least one point $x$ of $E$ such that the orbits
$K_1(x)$ and $K_2(x)$ are distinct point sets. To do this we shall suppose the
contrary, that for every $x$, $K_1(x) = K_2(x)$, and obtain a contradiction.


3.1 Let $A$ denote the set of points each of which is fixed under $K_1$. This
set is closed in $E$ and homeomorphic to a line. Let $x$ be any point in the
complement of $A$. Then $K_1(x) = K_2(x)$ is a simple closed curve. Let $g_1$
denote an element of $K_1$ which is not in $K_2$. This element exists for, otherwise,
$K_1$ would be a subgroup of $K_2$ and could not then be distinct from it since
both are circle groups. Let $y$ be the point $g_1(x)$. The point $y$ is distinct from
$x$ because $g_1$ is not the identity and $x$ is not in $A$. Since $y \subset K_2(x)$ there is
an element $g_2$ of $K_2$ such that $g_2(y) = x$. Then $g_2 g_1$ is an element $g$ of $G'$
such that $g(x) = x$.


3.2 We shall now show that $g$ is the identity. Let $F$ denote the points of
$E - A$ which are fixed points of $g$. This set is closed in the set $E - A$ and
is not vacuous, since it contains $x$. If we can show that $F$ is also open in
$E - A$ we shall have accomplished our immediate object, for in this case,
since $E - A$ is both open and connected, it will follow that $F$ must coincide
with $E - A$. On the other hand the fixed points under $g$ form a closed set,
necessarily, and this set must contain $\bar{F} = \overline{E - A} = E$. But if all points of
$E$ are fixed points of $g$, $g$ is the identity.


3.3 Let us show, then, that $F$ is open. To this end, let $z$ denote an arbitrary
point of $F$. By our assumption in 3, $K_1(z) = K_2(z)$. From this and the
composition of $g$, $K_1(z)$ is invariant under $g$. This means that $g$ may be
regarded as a homeomorphic mapping of a simple closed curve into itself. But
$g$ has a fixed point, namely $z$. Furthermore $g$ cannot reverse orientation on
$K_1(z) = K_2(z)$ since this would imply that either $g_1$ or $g_2$ is sense-reversing.
This is impossible because each of them is an element of a connected group
($K_1$, $K_2$ resp.). Finally, the further condition upon $g$ that its powers must
generate a compact group obliges it, as is well known, to be the identity as a
transformation of $K_1(z)$. That is, $g(z') = z'$ for every point $z'$ of $K_1(z)$.
Then, of course, every point of $K_1(z)$ is fixed under all powers of $g$.

The last remark makes it clear that points sufficiently near to $z$ are moved
arbitrarily little by all powers of $g$ and it is an immediate consequence of this
that all orbits sufficiently close to $K_1(z)$ have on them at least one fixed point
under $g$. But then, by the argument above, such orbits are pointwise fixed
under $g$ and, finally, the set $F$ is open in $E - A$.



3.4 Now, at last, if $g$ is the identity of $G'$ it follows that $g_1$ is the inverse
of $g_2$ and this contradicts its choice as an element not in $K_2$.


4. Throughout most of the paper we shall be concerned with the group,
we denote it by $G$, which is the closure of the group generated by $K_1$ and $K_2$.
It will transpire that $G$ is identical with the original $G'$.


4.1 No orbit G(x) can be three-dimensional, for if it were it would have to
contain inner points (relative to E) and it would thereupon be open, since it
is strongly homogeneous. But it is closed and so would coincide with E. This
it cannot do, being compact. It follows that orbits G(x) are at most two-
dimensional. We shall learn, much later, that these orbits, save one, are in
fact two-spheres. For the moment, it is clear that if y is a point not on some
G(x) there is at least one point z of G(x) which is accessible⁶ from y. It
follows, from the strong homogeneity of G(x), that every point of that set is
accessible from at least one point of G(y). Since G(y) is connected we may
conclude that every point of G(x) is likewise accessible from y. This establishes:


4.1′) G(x) is the boundary of every complementary domain and each point
of G(x) is accessible from each domain.


This implies, of course, that no subset of G(x) can separate E.


4.2 We shall now introduce into E a (cylindrical⁷) coördinate system appro-
priate to the “rotation” character of $K_1$. In this $(z, r, \theta)$-system the set of
points which are fixed points under each transformation of $K_1$ constitute the
axis A: r = 0. This axis is the edge of the closed half-plane P which we may
take as the “initial plane” θ = 0 of our coördinate system. Each orbit $K_1(x)$
intersects the half-plane P in precisely one point which we may call the initial
point of the orbit.


4.2′) We shall denote by θ{x} the single-valued continuous function defined
everywhere in E with point-values in P which maps each point into the initial
point of its orbit.
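
As a concrete illustration only, one may take the model case in which $K_1$ is the group of ordinary rotations about the z-axis of Cartesian three-space. This is an assumption of the sketch (the paper uses the coördinates topologically), and the name `initial_point` is hypothetical. In that model the map θ{x} keeps z and r and sets θ = 0:

```python
import math

def initial_point(pt):
    """Send a Cartesian point to the initial point of its orbit under
    rotations about the z-axis (the model case assumed here for K_1)."""
    x, y, z = pt
    r = math.hypot(x, y)   # distance from the axis A: r = 0
    return (r, 0.0, z)     # the point of the half-plane P (theta = 0) on that orbit
```

Points of the axis are their own images, and applying the map twice gives the same result as applying it once, as expected of a projection onto P.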

It may assist the reader if he will observe that the orbits $K_1(x)$ are actual
circles⁸ in our coördinate system: if x is the point $(\bar z, \bar r, \bar\theta)$ then $K_1(x)$ is the
set $z = \bar z$, $r = \bar r$. The orbits $K_2(x)$ however are merely simple closed curves
(topological circles). While we may interchange the rôles of $K_1$ and $K_2$ if


⁶ I.e. there exists an arc yz, $yz \cdot G(x) = z$.
⁷ We choose this for definiteness; we shall make only nominal use of the form of
the coördinate system and none, for the present, of the coördinates themselves.


⁸ The fixed points excepted, of course.



NON-ABELIAN COMPACT CONNECTED TRANSFORMATION GROUPS. 379 


we wish, we cannot, at this moment, find a coördinate system in which both
sets of orbits are circles.


4.3 Since the orbit G(x), for every point x, is invariant under $K_1$ it is clear
that the set $P_x = P \cdot G(x)$ is identically the set θ{G(x)} and that
$G(x) = K_1(P_x)$ is a “figure of rotation.” We know that G(x) is a homogeneous
compact connected set. In the next section we shall prove that G(x) is locally
connected.


5. Let x denote an arbitrary point for which $K_2(x)$ is distinct from
$K_1(x)$, section 3. We may suppose, without loss, that x is the initial point
of its orbit, a point of P. Then $\theta\{K_2(x)\}$ will contain x. On the other hand,
since $K_2(x)$ is not a subset of $K_1(x)$ and certainly not a subset of A, $K_2(x)$
cannot be a subset of $K_1(x) + A$ (whether or not x is a point of A) and con-
sequently $\theta\{K_2(x)\}$ is not a subset of x + A. It follows, from the fact that
$\theta\{K_2(x)\}$ is a Peano space, that it must contain some arc yz (for some pair
of points y and z) which has no point in A. Now $K_1(yz)$ must contain a 2-cell
which we will denote by S (4). The two-cell S, by its construction, must be a
subset of the orbit G(x).


5.1 Let s denote an inner point of the two-cell S. We propose to show, here,
that G(x) is locally connected at s. If this is not the case then there must
exist an open set O containing s and a sequence of points $s_n$ of G(x) such that
$\lim s_n = s$ and such that s and $s_n$ are in no connected subset of $O \cdot G(x)$.
There exists a sequence of elements $g_n$ of G such that $s_n = g_n(s)$ and such
that the $g_n$'s converge to the identity.⁹ Now let S′ denote a 2-cell subset of S
which has s as inner point and which is contained in O. Almost all of the two-
cells $g_n(S')$ will lie in O and no two of these can have a point in common,
by our choice of O. There is an arc ts which has s only on G(x), by 4.1′.
The two-cell S′ must have two “sides” in E at the point s. Then it is clear
that the arc ts must approach S′ from one of these sides while the sets $g_n(S')$,
which converge to S′, must approach it from the other. Since $g_n(s)$ is a point
of $g_n(S')$ it follows easily that for at least one large n, $g_n(ts)$ and S′ have a
common point distinct from s. But this is a contradiction of the fact that
$g_n(ts)$ and G(x) can have $s_n$, only, in common.


5.1′ Then we have shown that for every point x for which $K_1(x) \ne K_2(x)$
the set G(x) is locally connected, and is two-dimensional. Then the set


⁹ Let $g'_n$ denote a sequence such that $s_n = g'_n(s)$. We may take it, from the
compactness of G, that there is a $g = \lim g'_n$. From the simultaneous continuity,
$g(s) = s$. Then $g^{-1}(s) = s$ and $g'_n g^{-1}(s) = s_n$. Let $g_n = g'_n g^{-1}$.




$P_x = P \cdot G(x)$ is locally connected, and we know that it is not a subset of
A.


5.1″ Suppose, now, that y is a point for which $K_1(y) = K_2(y)$. From the
fact that every g of G is the limit of finite products of elements from $K_1$ and
$K_2$ it follows, at once, that $G(y) = K_1(y)$. From this it is clear that $P_y$ is a
point.


5.2 We have shown now that $P_x$ is a Peano space and not a subset of A or
that it is a point. We have seen, also, that in the first alternative at least one
(and therefore every) point is interior to some two-cell of G(x).¹⁰ It should
be reasonably clear that $G(x) = K_1(P_x)$ is a point, a circle, a torus or a two-
sphere (since it must be strongly homogeneous). We shall show, in the next
section, that this is indeed the case. We shall then show that G(x), one point
x excepted, is a two-sphere.


6. Our analysis of G(x) will depend upon the a priori possible structure
of $P_x$. It is important to bear in mind that $G(x) = K_1(P_x)$ and that
$G(x) - A \cdot G(x)$ is the product space of $P_x - A \cdot P_x$ and $K_1$ (see 4). Where details
are intuitively clear and technically elementary, they will be omitted.


6.1 Suppose that $P_x$ contains an arc qr having q and r only on A. In this,
the typical case, $K_1(qr)$ is a two-sphere and must coincide with G(x), by 4.1′.
Furthermore, $P_x = qr$.


6.2 Suppose that $P_x$ contains a simple closed curve C which has at least two
points on A. Since C cannot be a subset of A we are led to the existence of an
arc qr, as above, and then to a contradiction.


6.3 Suppose that $P_x$ contains a simple closed curve C having no point on A.
In this case (which will later be shown to be impossible) $K_1(C)$ is a torus and
must coincide with G(x). Furthermore, here, $P_x = C$.


6.4 Suppose that $P_x$ contains a simple closed curve C which has precisely
one point in common with A. In this case $K_1(C)$ is a pinch-torus which,
separating space, must coincide with G(x). This contradicts the homogeneity
of G(x) and is impossible.


6.5 Suppose that $P_x$ is a tree one of whose endpoints is not in A. In this
case it is clear that $K_1(P_x)$ is a set at least one of whose points is not an
interior point of a two-cell. This contradicts section 5.2 and is impossible.


¹⁰ We are not asserting, although it is the case, that these cells are open in G(x).




6.6 Suppose that $P_x$ is a tree not, degenerately, a single point. The tree
must have at least two endpoints and, by the preceding discussion, both of
these must lie in A. On the other hand, $P_x$ cannot be a subset of A. Then
we are led either to contradiction or to the situation of section 6.1.

We may summarize this section in the following: 


6′) $P_x$ is a point or it is an arc with endpoints (and these only) on A or
it is a simple closed curve with no point on A.


6″) G(x) is a point or circle or it is a two-sphere or it is a torus.


7. It is convenient to introduce at this point the continuous decom-
position space R* of E in which each orbit G(x) is regarded as a “point” (4).
This is a locally compact Peano space. We shall see that R* must be a ray,
its single endpoint corresponding to the unique fixed point (in E) of G. It is
clear that P may be regarded as a continuous decomposition space of E under
the group $K_1$. But, further, the space R* is a continuous decomposition space
of P in which each set $P_x$ is regarded as a point, namely the point
$G(x) = K_1(P_x)$ of R*. The last is an immediate consequence of the preceding
statements.


7.1 It follows from this, but is also directly seen, that the points x of P for
which $P_x$ is a point form a closed set and that in consequence the points y for
which $P_y$ is a simple closed curve or an arc form an open and, as we know
from sections 3 and 5.1′, non-vacuous subset of P. Now the “points” y*
of R* corresponding to these points y must be cut-points of R* of order two
precisely. For, the surfaces which correspond to them in E are invariant
under G and separate E into precisely two domains. The “points” y* com-
prise all of the cut-points of R*. For, the other “points” of R* correspond
to points x of P for which $P_x$ is a point and G(x) is an invariant circle or
point; in either case a non-separating subset of E.


7.2 It follows at once from the Cyclic-Element theory (see e.g. 1 and 3)
that R*, being a Peano-space the set of whose cut-points is open, must be a
tree. It must, furthermore, be a very simple tree since its cut-points are all
of order two: in fact it can be an open curve (homeomorph of a line), a ray
(homeomorph of a closed half-line), or an arc, and nothing else. On the other
hand, each cut-point of R* must separate it into two components of which one
is compact and the other is not. For that is precisely what the corresponding
sphere or torus does in E. It is clear, then, that R* is a ray.


7.3 Now the endpoint, let us call it f*, of the ray R* must correspond in E
to a point or to a circle, but in either event to a point f of P. This point f is
the only point of P which satisfies the “equation” $P_x = x$. Now let a denote
a point of A − f. Then $P_a \ne a$, and $P_a$ must be an arc ab where b, also, is
in A, by 6′). We shall prove, quickly, that f is a point of A between a and b.
Let $m_a$ denote a variable point of the arc ab of A and let it “move” con-
tinuously from a to b. In each position there is associated with it in a con-
tinuous way a point $m_b$ which is the other endpoint of $P_{m_a}$. As $m_a$ moves
towards b the point $m_b$ must advance towards a. This is immediately clear
on a consideration of the half-plane P which is separated by each of these
mutually exclusive arcs. It follows that in some position, $m_a = m_b$ and at
this moment it must be the point f.


7.4 Let R denote a definite one of the rays marked off on A by the point f.
From the fact that R is closed and unbounded in E it is clear that the image
of R in R* cannot belong to any compact subset of R*. This image, on the
other hand, must contain f* and must be connected. It follows that the image
of R covers all of R*. This implies, in particular, that all orbits G(x) with
the exception of G(f) = f are two-spheres. Rather more than this is implied,
however, for it is clear that R and R* are, abstractly, entirely equivalent. In
other words, R may be regarded as the decomposition space of E under the
group G where the mapping is the one which carries each point x of E into
the unique intersection of G(x) with R. This will ultimately be seen to imply
that the set of orbits is topologically equivalent to the family of spheres with
center at the origin.


8. Now that orbits under G are seen to be two-spheres, as we wished
them, we shall consider how G acts upon them individually. Throughout this
section S will denote a particular one of these two-spheres and S is a G(p)
for some point p of R. Let $T_p$ denote the subgroup of our original G′ con-
sisting of all elements for which the point p is a fixed point. This group
contains $K_1$. We shall show that it coincides with $K_1$.


8.1 By section 4.1, G′(p) = G(p). Then, if x denotes a point of S the
orbit $T_p(x)$ is a subset of $G'(x) = S$ and contains $K_1(x)$. This last set is a
simple closed curve, if we suppose that x is not the point p or that unique other
point, call it $\bar p$, of $A \cdot S$. Then it follows, precisely as in section 4.1, that
$T_p(x) = K_1(x)$. From this, by the argument of section 3.2, we conclude
that $T_p$ is identical with $K_1$. Therefore the set of points of S fixed under an
arbitrary element of $T_p$, the identity excepted, consists of the points p and $\bar p$.
For another point x, the orbit $T_p(x)$ is a simple closed curve.


8.2 For the remainder of this section all symbols, x, y, z, etc., denoting
points will refer to points of S and the group G′ will be regarded as a $T_G(S_2)$
where $S_2$ denotes the abstract two-dimensional sphere.

Let $T_x$ denote the subgroup of G′ leaving the point x fixed. Let y and g
be such that g(x) = y. It is quite easy to see that $T_y = gT_xg^{-1}$, the transform
of $T_x$ by g. Then, if h is such that h(p) = x, $T_x = hT_ph^{-1}$ and $T_x$ is a circle
group. Let $\bar x$ denote $h(\bar p)$, $\bar p$ as above. Clearly, $T_x(\bar x) = \bar x$. Furthermore,
from the fact that p and $\bar p$ only are fixed points of $T_p$ it follows that x and $\bar x$
are the only fixed points of $T_x$. From the fact that $T_p = K_1 = T_{\bar p}$ (cf. section
8.1, replacing p by $\bar p$) it follows, at once, that $T_x = T_{\bar x}$. The point $\bar x$ will be
called the conjugate of the point x. It is clear that the relation of conjugate
is unique and reciprocal. Moreover, it is invariant under G′. For if $T_x$ leaves
x and $\bar x$ invariant, then $gT_xg^{-1}$ leaves gx and $g\bar x$ invariant.


8.3 The transformation $c(x) = \bar x$ is a homeomorphism of period two of S
into itself. Because of the remarks above, it suffices to prove the continuity
of c(x). Let $x_n \to x$ and let $g_n$ be a sequence of elements of G′ such that
$g_n(x_n) = x$ and $g_n \to g_0$, the identity. Then, $g_n(\bar x_n) = \bar x$, by the last remark
of the preceding paragraph, and $g_n^{-1}(\bar x) = \bar x_n$ must converge to $\bar x$.


8.4 For a fixed $T_x$ all points, x and $\bar x$ excepted, have simple closed curves as
orbits. The space of all these orbits, including $x = T_x(x)$ and $\bar x = T_x(\bar x)$, is
a closed interval and we may assign to each orbit $T_x(y)$ a real number t,
$0 \le t \le 1$. This assignment is a continuous one, in a sense which is readily
understood. We can now assign to each point y of S the same real number,
let us call it α(y), which is associated with its orbit $T_x(y)$. We may “nor-
malize” so that α(x) = 0 and $\alpha(\bar x) = 1$. From the continuity of this func-
tion it follows that if y is near x, α(y) is nearly 0 while $\alpha(\bar y)$, in view of
section 8.3, is nearly 1. Hence the function $\alpha(z) - \alpha(\bar z)$ is negative near x
and positive for points z near to $\bar x$. It follows that for at least one point y,
$\alpha(y) = \alpha(\bar y)$ and, for this y, $T_x(y) = T_x(\bar y)$.

We have just seen that for every point x there is at least one orbit under
the corresponding $T_x$ which contains at least one pair of conjugate points. We
shall show that this orbit is uniquely determined. For the present any such
orbit will be called an equator with the point x as pole and denoted by $Q_x$.


8.5 An equator, $Q_x$, contains the conjugate of each of its points. For, let
z belong to $Q_x = T_x(y)$ for some y. There is a g in $T_x$ such that g(y) = z.
But then $g(\bar y) = \bar z$ and this point, too, belongs to $Q_x$. But, furthermore, the
set $Q_x$ divides S into two domains. The transformation $c(z) = \bar z$, having
no fixed points and leaving $Q_x$ invariant, as we have just seen, must inter-
change these domains. Accordingly, the only points w for which it can be
true that $T_x(w) = T_x(\bar w)$ are points of $Q_x$. Therefore the equator $Q_x$ is
indeed unique and it is completely characterized by the fact that it is a self-
conjugate orbit of $T_x$. Equivalently, the equator may be characterized as a
self-conjugate simple closed curve invariant under $T_x$.


8.6 The equator is an invariant of G′ in this sense: if x, w, and g are such
that g(x) = w then $gQ_x = Q_w$. For, first,

$$T_w(gQ_x) = gT_xg^{-1}(gQ_x) = gT_x(Q_x) = gQ_x$$

so that $gQ_x$ is certainly invariant under $T_w$. Secondly, if z is a point of $gQ_x$,
z = g(y) for some y of $Q_x$. Hence $\bar z = g(\bar y)$ is also in $gQ_x$ and, it follows,
$gQ_x$ is self-conjugate. Finally, it is clear that $gQ_x$ is a simple closed curve.
Therefore, by the preceding paragraph, $gQ_x$ must be $Q_w$. It is a corollary to
this that an equator is set-wise invariant under any element which interchanges
its “poles.” For, if e is an element such that $e(x) = \bar x$, then
$eQ_x = Q_{\bar x} = Q_x$.


8.7 Now for any x there is at least one e such that $e(x) = \bar x$. For any such
element, $e(\bar x) = x$ and $e^2(x) = x$. This element being in a connected group
G′ cannot reverse orientation on S but must interchange the domains deter-
mined on S by the invariant $Q_x$. Consequently it must reverse orientation on
$Q_x$ and it must have a pair of fixed points. These must, of course, be mutually
conjugate. Let y denote one of them. Then e(y) = y, so that e is in $T_y$.
Hence, since $e(x) = \bar x$, it follows that x is in $Q_y$ for at least one y of $Q_x$. But
let z denote any other point of $Q_x$. Then z = h(y) for some h in $T_x$ and
$Q_z = hQ_y \ni h(x) = x$. Then we have shown that for every point z on $Q_x$ the
“pole” x lies on $Q_z$.


8.8 Here we shall show that if an element of $T_x$ interchanges one pair of
conjugate points of $Q_x$ it interchanges every pair on $Q_x$ and is of order two.
Let g of $T_x$ and y of $Q_x$ be such that $g(y) = \bar y$. Then $g^2$ is in $T_x$ and leaves
y fixed. But in this case $g^2$ must be the identity of G′ by 1′. Now let the
integer n and the element h of $T_x$ be such that $h^{2^n} = g$. Then

$$gh(y) = hg(y) = h(\bar y) = \overline{h(y)}.$$

This means that g does interchange those pairs of conjugates which are of the
form h(y) and $\overline{h(y)}$ for n = 0, 1, 2, ···. But such points are
everywhere dense on $Q_x$ and our opening assertion follows from the continuity
of the conjugate.




8.9 We shall see now that if y and z are distinct non-conjugate points of $Q_x$
then $Q_y \cdot Q_z = x + \bar x$. From the last remark of section 8.7, $Q_y \cdot Q_z$ must con-
tain x and $\bar x$. Suppose, then, that w is some point of this intersection. Let
g denote an element of $T_y$ for which $g(x) = \bar x$, and let h denote an element
of $T_z$ for which $h(x) = \bar x$. Since hg(x) = x it is clear that hg is in $T_x$. On
the other hand, $h(w) = \bar w$ and $g(w) = \bar w$ by section 8.8. Therefore, hg(w)
= w. But if w is distinct from x and $\bar x$ this is possible if and only if hg is
the identity. Now in this case, y = hg(y) = h(y). Since h is an element
of $T_z$ and y is distinct from z and from $\bar z$, y = h(y) implies that h is the
identity. This contradiction to our choice of h concludes the argument.
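
The statement of 8.9 can be checked numerically in the model case that the paper eventually shows to be the only one: G′ acting as the rotation group on the unit sphere, with the conjugate of a point taken to be its antipode and the equator with pole y taken to be the great circle perpendicular to y. These identifications are assumptions of the sketch, not steps of the proof, and the helper names are hypothetical. Two such great circles meet along the direction of the cross product of their poles:

```python
import math

def cross(u, v):
    """Vector product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def normalize(v):
    """Scale a non-zero vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def equator_intersection(y, z):
    """Common points of the great circles with poles y and z, where y and z
    are unit vectors that are neither equal nor antipodal."""
    n = normalize(cross(y, z))
    return n, tuple(-c for c in n)
```

For y and z chosen on the great circle perpendicular to x = (0, 0, 1), the two returned points are x and its antipode, in agreement with $Q_y \cdot Q_z = x + \bar x$.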


8.10 Let C be any simple closed curve orbit of $T_x$ and let y be any point on
$Q_x$. Then $C \cdot Q_y$ consists of precisely two points. For, since C separates x and
$\bar x$ while $Q_y$ contains them, it follows that $C \cdot Q_y$ must contain at least two points
a and b. There is some g in $T_x$ for which g(a) = b. The point b must lie in
$gQ_y$, since $Q_y$ contains a. Then $Q_y$ and $Q_{g(y)} = gQ_y$ have in common the point
b which is certainly neither x nor $\bar x$. But this can be reconciled with section
8.9 only if g(y) = y or $g(y) = \bar y$. The first alternative implies that g is the
identity and is ruled out. The second alternative means, by section 8.8, that
$g^2$ is the identity. Now since g is an element of the circle group $T_x$ and is of
order two, it is uniquely determined. Then it is clear that the intersection
$C \cdot Q_y$ cannot contain more than two points and it must consist of the points
a and g(a) where g is the unique element of $T_x$ of order two.


9. We are now in a position to set up on S and then to extend to E a
coördinate system suitable for our purpose. This “spherical” coördinate
system will be furnished by two appropriate subgroups which we shall call K
and H, of G′. One of these we shall take as the group $K_1$, section 8.1, and
now designate it by K. It would be possible to take the other as $K_2$ but it is
more convenient at this point to forget $K_2$. We shall continue to use the
notations of the preceding section.

Let q denote a definite point on the equator $Q_p = T_p(q)$ and let H denote
the group $T_q$. The group $K = K_1$ is $T_p$; both H and K are circle groups.
Let φ be a symbol for a group parameter specifying elements of the group H
and, similarly, let θ be a symbol for elements of K. These symbols may be
regarded as real numbers ranging from zero to one, and each group may be
regarded as the additive group of the corresponding numbers, modulo one.
We shall restrict our φ to the range $0 \le \phi \le 1/2$ and denote the set of these
φ's by H*. Now H*(p) is an arc beginning at the point p and ending at the
point $\bar p$ conjugate to p (section 8.2). The arc H*(p) is, of course, one of the
two arcs $p\bar p$ on the equator $Q_q$.




We determine two coördinates $\theta_w$ and $\phi_w$ of an arbitrary point w of S as
follows: the point w belongs to the unique orbit K(w) and this meets the set
H*(p) in precisely one point, by section 8.10. If we suppose that w is not
p or $\bar p$ there is a unique parameter $\theta_w$, $0 \le \theta_w < 1$, such that
$\theta_w^{-1}(w) = K(w) \cdot H^*(p)$.
It is hoped that no confusion will arise here from our slightly ambiguous use
of the parameter as an element of the group: we shall do this again with φ.
If w is p or $\bar p$ then $\theta_w$ will be chosen as the parameter 0, corresponding to the
identity of K. Now there is a unique parameter φ which we shall designate
by $\phi_w$, $0 \le \phi_w \le 1/2$, such that $\phi_w(p) = \theta_w^{-1}(w)$: this parameter designates
the unique element of H which carries p to the point $\theta_w^{-1}(w)$ of $Q_q$.

To sum up, we have associated with each point w of S a unique pair of
coördinates $\theta_w$ and $\phi_w$ such that, when we interpret these coördinates as the
elements of G′ with which they are associated, we have the following relation:
$\theta_w \phi_w(p) = w$.

We can now extend this coördinate system to the entire space E by intro-
ducing a third coördinate ρ: $0 \le \rho < \infty$ designating in a one-one continuous
way the points of the axis R. The coördinate ρ = 0 corresponds to the unique
fixed point f (section 7.4) under G′. This last coördinate does not, of course,
correspond to any group-element of G′. Suppose, now, that w denotes an
arbitrary point of E. If w is the point f, $\rho_w = 0$ and $\theta_w$ and $\phi_w$ will be defined
to be 0. If w is not f the orbit G′(w) is a two-sphere which has a unique
point, call it $p_w$, on R. With this point there is associated a value of the
coördinate ρ, call it $\rho_w$. Finally we determine the coördinates $\theta_w$ and $\phi_w$ by
the observation that the sphere $G'(w) = G'(p_w)$ is completely and (the points
$p_w$ and $\bar p_w$ excepted) uniquely covered by the set of points $K\{H^*(p_w)\}$. That
this is indeed the case is immediately clear if we regard $G'(p_w)$ as the sphere
S of section 8 and following.


10. In this section we shall bring to a close the proof of our theorem.
Let g denote an arbitrary element of G′. Let p be a point of R distinct from f
and write g(p) = w. Now, $g(p) = \theta_w \phi_w(p)$ and consequently, section 8.1,
$g^{-1}\theta_w \phi_w$ is an element of K. But then $g^{-1}$, hence g, is an element of that
subgroup of G′ which is generated by the two groups H and K. Therefore,
G′ itself must be the group generated by H and K. It is clear from this that,
in the coördinate system of section 9, G′ is the group of all “rigid rotations”
of E about the “origin” f.
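
The coördinates of section 9 can be made concrete in the rigid model that this section arrives at. The following is a sketch under stated assumptions, not the authors' construction: S is taken to be the unit sphere, p the north pole (0, 0, 1), K the rotations about the z-axis, H the rotations about the x-axis (so that q = (1, 0, 0) lies on the equator of p), and both parameters are measured in fractions of a full turn. All function names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi

def rot_x(phi, v):
    """Element phi of H: rotation about the x-axis by phi full turns."""
    a = TWO_PI * phi
    x, y, z = v
    return (x, y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a))

def rot_z(theta, v):
    """Element theta of K: rotation about the z-axis by theta full turns."""
    b = TWO_PI * theta
    x, y, z = v
    return (x * math.cos(b) - y * math.sin(b), x * math.sin(b) + y * math.cos(b), z)

def point(theta, phi, p=(0.0, 0.0, 1.0)):
    """The point theta(phi(p)) of the sphere."""
    return rot_z(theta, rot_x(phi, p))

def coordinates(w):
    """Return (theta_w, phi_w) with 0 <= theta_w < 1 and 0 <= phi_w <= 1/2
    such that point(theta_w, phi_w) equals the unit vector w up to rounding."""
    wx, wy, wz = w
    phi = math.acos(max(-1.0, min(1.0, wz))) / TWO_PI
    if math.hypot(wx, wy) < 1e-12:
        return 0.0, phi              # w is p or its conjugate: theta_w is set to 0
    theta = (math.atan2(wx, -wy) % TWO_PI) / TWO_PI
    if theta >= 1.0:                 # guard against rounding up to a full turn
        theta = 0.0
    return theta, phi
```

In this model `point(theta, phi)` plays the part of the relation $\theta_w \phi_w(p) = w$, and `coordinates` inverts it away from the two poles, where θ is fixed at 0 by convention as in the text.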


SMITH COLLEGE, AND
QUEENS COLLEGE, NEW YORK.




BIBLIOGRAPHY. 


1. Kuratowski and Whyburn, “Sur les éléments cycliques et leurs applications,”
Fundamenta Mathematicae, vol. 16, pp. 305-331.

2. Mayer and Thomas, “Foundations of the theory of Lie groups,” Annals of
Mathematics, vol. 36, pp. 770-822.

3. Menger, Kurventheorie, Berlin, 1932.

4. Montgomery and Zippin, “Periodic one-parameter groups in three-space,” Trans-
actions of the American Mathematical Society, vol. 40, pp. 24-36.

5. Montgomery and Zippin, “Compact abelian transformation groups,” Duke Mathe-
matical Journal, June, 1938.

6. Pontrjagin, “The theory of topological commutative groups,” Annals of Mathe-
matics, vol. 35, pp. 361-388.

7. Pontrjagin, “Théorie des groupes continus,” Comptes Rendus, vol. 198, pp.
238-240.




A REMARK ON NORMAL EXTENSIONS.* 


By O. F. G. SCHILLING.¹


Let r be an arbitrary algebraic Riemann surface and G an arbitrary finite
group. Then there exist covering surfaces R of r such that the group of
relative automorphisms of R over r is isomorphic with G. The existence of
such coverings R/r follows from Riemann's existence theorem.² In this note
we want to present an existence theorem for arbitrary abstract fields of alge-
braic functions of one variable. The nature of our method, however, requires
a restriction on the structure of the given finite group G.

Let G be a finite group which has the following properties. There exists
a series of factor groups $G_i$ of G such that

$$G_{i+1} = G_i/Z_i$$

where $Z_i$ is a cyclic subgroup of $G_i$. We shall term G a pseudo-abelian group.
Now let $g(u_{i+1})$, $g(v_{i+1})$, $g(u_{i+1}v_{i+1})$ denote a fixed set of representatives of
$G_i$ for the elements $u_{i+1}$, $v_{i+1}$, $u_{i+1}v_{i+1}$ of the factor group $G_i/Z_i = G_{i+1}$. Then

$$g(u_{i+1})\, g(v_{i+1}) = g(u_{i+1}v_{i+1})\, z_i^{a(u_{i+1},\, v_{i+1})}$$

where $z_i$ is a generator of the cyclic group $Z_i$ of order $m_i$. The exponents
$a(u_{i+1}, v_{i+1})$ are integers which are uniquely determined by the selected
representatives $g(u_{i+1})$, $g(v_{i+1})$, $g(u_{i+1}v_{i+1})$ and the generator $z_i$.
Let $k = \Omega(x, y)$ be a field of algebraic functions of one variable whose
field of constants Ω is algebraically closed. Let χ be the characteristic of Ω.
Suppose now that we have already constructed a normal extension K of k whose
Galois group is isomorphic with a given pseudo-abelian group G. The
assumptions on G imply that K contains a series of normal subfields $K_i/k$
corresponding to the factor groups $G_i$. Moreover, $K_i/K_{i+1}$ is a cyclic extension
and its Galois group is equal to $Z_i$. Thus, every field $K_{i+1}$ with the Galois
group $G_{i+1} = G_i/Z_i$ is imbedded in a normal field $K_i$ whose Galois group is
equal to $G_i$.


THEOREM 1. Let $k = \Omega(x, y)$ be a function field of characteristic χ and


* Received October 24, 1938.
¹ Johnston Scholar at the Johns Hopkins University for the year 1938-1939.
² A. Weil, “Généralisation des fonctions abéliennes,” Journal de Liouville, vol. 11
(1937), ser. 9.


G a pseudo-abelian group whose order is relatively prime to χ. There exist
infinitely many normal extensions K over k whose Galois groups are isomorphic
with G.


Proof. We proceed by induction. Let $G_s = Z_s \ne 1$ be the last factor
group in a composition of the given pseudo-abelian group G. Since the order
of G is relatively prime to the characteristic χ of k we can find an infinitude
of radical extensions

$$K_s = k(a^{1/m_s})$$

of degree $m_s$ over k. The Galois group of any one of these fields $K_s/k$ is
isomorphic with $G_s = Z_s$. Let us take a particular one of these fields $K_s$ and
suppose that we have already constructed a normal extension $K_{i+1}/k$ whose Galois
group is isomorphic with $G_{i+1} = G_i/Z_i$. We want to prove the existence of
a normal extension $K_i$ which contains $K_{i+1}$ and whose Galois group over k is
isomorphic with $G_i$. We express this problem in terms of the theory of linear
algebras. The relations

$$g(u_{i+1})\, g(v_{i+1}) = g(u_{i+1}v_{i+1})\, z_i^{a(u_{i+1},\, v_{i+1})}$$

which define the group $G_i$ in terms of $Z_i$ and $G_{i+1}$ give rise to a factor set of
$K_{i+1}$ relative to k. We define the factor set by

$$\zeta_i^{a(u_{i+1},\, v_{i+1})}$$

where $\zeta_i$ denotes a fixed primitive $m_i$-th root of unity. A theorem of R. Brauer
yields that $K_{i+1}$ can be imbedded in a normal field $K_i$ in the prescribed fashion
if the algebra

$$A_i = (K_{i+1}/k,\ \zeta_i^{a(u_{i+1},\, v_{i+1})})$$

is a full matric algebra over k.³ Since the field of constants is algebraically
closed, it follows that

$$A_i \sim k.⁴$$

Consequently the existence of a normal field $K_i$ is established.


COROLLARY. In particular, there exist infinitely many fields K over k
whose Galois groups are groups of order $l^m$ where $(l, \chi) = 1$.⁵


³ R. Brauer, “Über die Konstruktion der Schiefkörper, die von endlichem Rang in
bezug auf ein gegebenes Zentrum sind,” Crelle, vol. 168 (1932).

⁴ C. C. Tsen, “Divisionsalgebren über Funktionenkörpern,” Göttinger Nachrichten
(1934).

⁵ A. Speiser, “Theorie der Gruppen endlicher Ordnung” (2nd edition), Theorem 78,
p. 69.




390 O. F. G. SCHILLING. 


We now want to prove a refinement to Theorem 1. Let G be a group of
order $l^m$. Then the totality of elements $g_1 g_2 g_1^{-1} g_2^{-1}$ and $g_1^{\,l}$, where $g_1, g_2 \in G$,
generates an invariant subgroup G* of G. The factor group G/G* is an
abelian group of type (l, l, ···, l); let $l^n$ be its order. Then n is equal to
the least number of generators of the group G.

Let K be a normal extension of k whose Galois group is equal to G.
Then there belongs to $G^* \subset G$ a subfield K* of K which is the join of all
cyclic subfields $Z_i$ of degree l over k. We can build up K over k in the
following fashion:

$$k \subset K^* = K_1 \subset \cdots \subset K_j \subset K_{j+1} \subset \cdots \subset K$$

where the fields $K_j$ are normal over k and $K_{j+1}$ are cyclic extensions of degree
l over $K_j$.


LEMMA 1. Let L be a normal extension of degree $l^n$ over $k = \Omega(x, y)$
and suppose that some prime divisor of k is completely ramified in L. Then
L is a cyclic extension of k provided that $(l, \chi) = 1$.


Proof. Let $\mathfrak p$ be the completely ramified prime divisor of k. Then
$\mathfrak p = \mathfrak P^{l^n}$ in L. The ramification theory implies that the ramification group
of $\mathfrak P$ coincides with the full Galois group G of L over k. Consequently the
Galois group of the complete field $L_{\mathfrak P}$ over $k_{\mathfrak p}$ is equal to G. But $L_{\mathfrak P} = k_{\mathfrak p}(\pi^{1/l^n})$
where $\nu_{\mathfrak p}(\pi) = 1$, $\pi \in k_{\mathfrak p}$. Namely $(l, \chi) = 1$ according to assumption. Hence
G is cyclic.


LEMMA 2. If $L = k(a^{1/l})$ and $a = \prod \mathfrak p_i^{\alpha_i}$ then the discriminant of L
over k contains exactly those prime divisors $\mathfrak p_i$ for which $\nu_{\mathfrak p_i}(a) \not\equiv 0\ (l)$.


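Proof. The discriminant of L over k is a divisor of the discriminant of
the defining equation $x^l - a = 0$. The latter is equal to $(-1)^{(l-1)(l-2)/2}\, l^l a^{l-1}$.
Since $(l, \chi) = 1$ it follows that at most the prime factors $\mathfrak p_i$ of a can be
ramified in L. Let $\mathfrak p$ be such a prime divisor and $k_{\mathfrak p}$ the associated $\mathfrak p$-adically
closed field of k. Then

$$L \times k_{\mathfrak p} = k_{\mathfrak p} + \cdots + k_{\mathfrak p} \quad \text{if } \nu_{\mathfrak p}(a) \equiv 0\ (l)$$
and
$$L \times k_{\mathfrak p} = k_{\mathfrak p}(a^{1/l}) \quad \text{if } \nu_{\mathfrak p}(a) \not\equiv 0\ (l).$$

Both assertions immediately follow from the ramification theory. Thus the
Lemma is proved.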
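
The discriminant of the binomial $x^l - a$ used in this proof is, by a standard computation, $(-1)^{(l-1)(l-2)/2}\, l^l a^{l-1}$, so apart from the factor $l^l$ only the prime factors of a can occur in it. The sketch below checks the closed form numerically against the product of squared root differences. It works with a positive real number a as a stand-in for an element of k, which is an assumption made purely for illustration, and the function names are hypothetical.

```python
import cmath
import math

def disc_numeric(l, a):
    """Discriminant of x**l - a (a > 0) as the product of (r_i - r_j)**2, i < j."""
    r = a ** (1.0 / l)
    roots = [r * cmath.exp(2j * math.pi * k / l) for k in range(l)]
    d = 1 + 0j
    for i in range(l):
        for j in range(i + 1, l):
            d *= (roots[i] - roots[j]) ** 2
    return d

def disc_formula(l, a):
    """Closed form (-1)**((l-1)(l-2)/2) * l**l * a**(l-1)."""
    sign = -1 if ((l - 1) * (l - 2) // 2) % 2 else 1
    return sign * l ** l * a ** (l - 1)
```

For l = 2 the closed form gives 4a, the familiar discriminant of $x^2 - a$, and for l = 3 it gives $-27a^2$.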

Let $l^{\tau(l)}$ be the number of divisor classes of degree 0 in k whose exponents
are equal to l. The number τ(l) is finite as a simple algebraico-geometric
argument yields.⁶ The maximal unramified abelian extension of exponent l
over k has then the degree $l^{\tau(l)}$.

Now let $\mathfrak p_1, \cdots, \mathfrak p_r$ be a finite number of prime divisors of k. We want
to determine the degree of the maximal abelian extension $K_l$ over k,


whose exponent is equal to l, and
which is ramified at most at $\mathfrak p_1, \cdots, \mathfrak p_r$.


LEMMA 3. The maximal abelian extension $K_l$ which has exponent l and
is ramified at most at $\mathfrak p_1, \cdots, \mathfrak p_r$ has degree $l^{r+\tau(l)-1}$ over k.


Proof. The arithmetical theory of radical extensions yields that $K_l$ is the
join of cyclic extensions $Z_i = k(a_i^{1/l})$. The numbers $a_i \in k$ are described by
the property that

$$\nu_{\mathfrak p}(a_i) \not\equiv 0\ (l)$$

for at most $\mathfrak p_1, \cdots, \mathfrak p_r$. (See Lemma 2.) Hence we have to determine the
index $[a : b^l]$ where
$\nu_{\mathfrak p_j}(a) \not\equiv 0\ (l)$ at most for $j = 1, \cdots, r$,
a are not l-th powers, and
b runs over all elements different from 0 in k.


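Let $\mathfrak A$ be the group of all divisors in k and c the group of all numbers in k
for which
$\nu_{\mathfrak p}(c) \equiv 0\ (l)$ for all $\mathfrak p$ of k.


Moreover let $\mathfrak A' = \{\mathfrak a'\}$ be the group of all divisors in k for which at most


$\nu_{\mathfrak p_j}(\mathfrak a') \not\equiv 0\ (l)$ for $\mathfrak p_j \in \{\mathfrak p_1, \cdots, \mathfrak p_r\}$.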
Then 
[a: be] (WM: M][c: bt] (MW: J7[M : 


according to the principles of reduction which are familiar in class field
theory.⁷ Therefore


according to the structure of the group %. Moreover, 


⁶ O. F. G. Schilling, “Foundations of an abstract theory of abelian functions,”
American Journal of Mathematics, vol. 61 (1939).

⁷ H. Hasse, “Bericht über neuere Untersuchungen und Probleme aus der Theorie
der algebraischen Zahlkörper” (Leipzig, 1930), Part Ia, §§ 15, 16.




for the same reason as before. Consequently,

$$[a : b^l] = l^{r+\tau(l)-1}$$

and the $a_i$ can be chosen as a complete set of representatives of the factor
group $a/b^l$.


THEOREM 2. Let G be a group of order $l^m$ whose least number of
generators is equal to $n$. There always exist normal extensions K over $k$
whose Galois group is isomorphic with G provided that $n \le r + \tau(l) - 1$.
The necessary $r$ branch points can be prescribed arbitrarily.


Proof. Let $r$ branch points $\mathfrak{p}_1, \dots, \mathfrak{p}_r$ be chosen such that $n \le r + \tau(l) - 1$.
Suppose moreover that $r = n - \tau(l) + 1$. We want to construct a normal
extension K over $k$ which is at most ramified at $\mathfrak{p}_1, \dots, \mathfrak{p}_r$. According to
Lemma 3 we can construct an abelian extension $K_l/k$ of exponent $l$ which is
at most ramified at $\mathfrak{p}_1, \dots, \mathfrak{p}_r$. We find that $[K_l : k] = l^{r+\tau(l)-1}$. In order
to construct the field K we put $K^* = K_l$. Then we get the right number of
generators. Theorem 1 implies the existence of fields $K_j$ such that


CeCe. 


and such that K has the right Galois group over $k$. In order to prove that
at most $\mathfrak{p}_1, \dots, \mathfrak{p}_r$ are ramified in K too, we again use induction. Suppose
we already constructed a normal extension $K_j/k$. We want to show that
$K_{j+1}$ can be chosen such as to have no other branch points than the given
$\mathfrak{p}_1, \dots, \mathfrak{p}_r$. Suppose now that $\mathfrak{p}$ is a branch point of some constructed
imbedding field $K_{j+1}$ which does not occur amongst the given set $\mathfrak{p}_1, \dots, \mathfrak{p}_r$.
Lemma 3 yields the existence of a cyclic field Z of degree $l$ over $k$ which is
ramified at $\mathfrak{p}$ and at least at one of the $\mathfrak{p}_j$. Namely, we only have to consider
the maximal field $K'_l$ belonging to $\mathfrak{p}, \mathfrak{p}_1, \dots, \mathfrak{p}_r$. The field Z is surely not
contained in $K^*$. Consequently the join $K_{j+1}Z$ is an abelian extension of
degree $l^2$ over $K_j$. The Galois group of $K_{j+1}Z$ over $K_j$ has type $(l, l)$ and the
Galois group of $K_{j+1}Z$ over $k$ is equal to the direct product of $G_{j+1}$ and a cyclic
group of order $l$. Since $\mathfrak{p}$ does not divide the discriminant of $K_j$ over $k$
we have


p= B*,- - 


in K; where [Kj : k] =I*. A prime divisor P*; of K; decomposes in K jh 


as follows: 




Namely $\mathfrak{P}_i = \mathfrak{P}^{l^2}$ would imply according to Lemma 2 that $K_{j+1}Z$ is a cyclic
extension of degree $l^2$ over $K_j$.

The theory of the decomposition groups of $\mathfrak{P}_i$ relative to $K_{j+1}Z$ and $K_j$
implies that $\mathfrak{P}_i$ is totally decomposed in one of the fields $L$ $(\ne K_{j+1}Z, K_j)$
lying between $K_{j+1}Z$ and $K_j$. The structure of the Galois group of $K_{j+1}Z$
over $k$ yields that the Galois group of $L$ over $k$ is isomorphic with $G_{j+1}$.⁸
Moreover $\mathfrak{p}$ is not ramified in $L$ for $\mathfrak{P}_i$ is not ramified in $L$, and $L$ is a normal
field. Repeating this process a finite number of times we finally arrive at a
field $L = K'_{j+1}$ which is at most ramified at $\mathfrak{p}_1, \dots, \mathfrak{p}_r$. Since $\mathfrak{p}_1, \dots, \mathfrak{p}_r$
are ramified in $K^*$ the field K which we ultimately obtain has the desired
properties.

It is now obvious that one can also construct normal fields K which are
exactly ramified at $\mathfrak{p}_1, \dots, \mathfrak{p}_r$ provided that $n = r + \tau(l) - 1$.


Remark 1. If $\Omega$ is an algebraically closed field of characteristic 0 then
$\tau(l) = 2p$ where $p$ denotes the genus of $k$. Thus Theorem 2 yields
$n = 2p + r - 1$ which is the formula obtained in the classical theory.


Remark 2. It is not difficult to compute the exact number of fields K
which are exactly ramified at given prime divisors $\mathfrak{p}_1, \dots, \mathfrak{p}_r$. The necessary
calculations are similar to the ones given by Witt.⁹


Remark 3. If $k = \Omega(x)$ then we must have at least $n + 1$ branch points.


Let $\Omega$ be a finite Galois field of $q = \chi^f$ elements. Suppose moreover that
$\Omega$ contains the $l$-th roots of unity, i.e., $q \equiv 1 \pmod{l}$. Under these assumptions
it is possible to generalize Theorem 1. However, it is necessary to impose
special conditions upon the prime divisors $\mathfrak{p}_1, \dots, \mathfrak{p}_r$ of $k = \Omega(x, y)$ which
shall be ramified in the normal extension K/k. Namely, in order to be able
to apply the theory of algebras, it is necessary to insure by appropriate
assumptions on the given prime divisors $\mathfrak{p}_i$ that the algebras


Ai — (Ki/k, Gi, ) 


are similar to $k$. Since the norm theorem is valid for the cyclic extensions
over $k$, it must be proven that
$A_i \times k_{\mathfrak{p}_j} \sim k_{\mathfrak{p}_j}$

for every $\mathfrak{p}_j$. The discriminant of $A_i$ contains at most the ramified prime

⁸ H. Reichardt, "Konstruktion von Zahlkörpern mit gegebener Galoisgruppe von
Primzahlpotenzordnung," Crelle, vol. 177 (1937).

⁹ E. Witt, "Konstruktion von galoisschen Körpern der Charakteristik p mit gegebener
Gruppe der Ordnung $p^f$," Crelle, vol. 174 (1936).




divisors of $K_i/k$, the factor set giving no contribution.¹⁰ In order to insure
Ai X ky, ~ ky, it suffices to suppose that the residue class fields of the dif. 
ferent p; contain sufficiently many /?-th roots of unity. In other words, we 
must suppose that q/'?s) =1(/?), where f(p;) denotes the absolute degree of 
$\mathfrak{p}_j$. This condition can always be realized. Moreover, we shall require that
the maximal abelian subfield $K^*$ of the field K to be constructed be completely
ramified. This means that $r = n$. Having satisfied these conditions, we are
able to state the following existence theorem whose proof can be taken over
verbatim from the theory of algebraic number fields.¹¹


THEOREM 3. Let G be a finite group of order $l^m$ whose least number
of generators is equal to $n$. Suppose that there are given $n$ prime divisors
$\mathfrak{p}_1, \dots, \mathfrak{p}_n$ of $k = \Omega(x, y)$ whose absolute degrees $f(\mathfrak{p}_i)$ are sufficiently large
so as to insure $q^{f(\mathfrak{p}_i)} \equiv 1 \ (l^m)$. There exists a normal extension K over $k$
whose Galois group is isomorphic with G and which is ramified exactly at the
given $n$ prime divisors.


Remark. Let $l^{\tau(l)}$ be the degree of the maximal abelian unramified
extension of exponent $l$ over $k$ which does not contain the cyclic extension of
degree $l$ over $\Omega$. A careful analysis of the proof of Theorem 3 yields that the
theorem on unramified normal extensions generalizes. Namely, if the number
of generators of the given group G is less than or equal to $\tau(l)$ then there exist
normal unramified fields K whose Galois group is isomorphic with G.

We remark that the existence of a normal field K with the Galois group
G over $k$ yields in general no information on the nature of the ramification.
Namely, the theorem on the structure of algebras which we used is a sufficient
(but not necessary) condition.

We next want to discuss the case of groups G whose orders $\chi^m$ are powers
of the characteristic $\chi$. The general existence proof for such groups was given
by E. Witt.

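Let $k_{\mathfrak{p}}$ be the $\mathfrak{p}$-adic closure of $k = \Omega(x, y)$ with regard to the prime
divisor $\mathfrak{p}$. Then every cyclic extension Z of $k_{\mathfrak{p}}$ of degree $\chi$ is given by the
root of an equation

$z^{\chi} - z = a_{\mathfrak{p}}$.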


¹⁰ H. Reichardt, "Die Diskriminante einer normalen einfachen Algebra," Crelle,
vol. 173 (1935).

¹¹ T. Tannaka, "Zyklische Zerfällungskörper der einfachen Ringe über dem
algebraischen Funktionenkörper," Sci. Rep. Tôhoku Univ., vol. XXIV (1935).

T. Tannaka, "Über die Konstruktion der galoisschen Körper mit vorgegebener
p-Gruppe," Tôhoku Mathematical Journal, vol. 43 (1937).




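Let $a_{\mathfrak{p}} = \omega \pi^{-\lambda} + \cdots$ where $\omega \in \Omega$, $v_{\mathfrak{p}}(\pi) = 1$ and $\lambda \ge 0$. Since $\Omega$ is supposed
to be algebraically closed we always get $\lambda > 0$, $\lambda \not\equiv 0 \ (\chi)$. The discriminant
of $Z = k_{\mathfrak{p}}(z)$ is equal to $\mathfrak{p}^{(\chi-1)(\lambda+1)}$. Consequently there exist infinitely many
different cyclic extensions Z of degree $\chi$ over $k_{\mathfrak{p}}$. Therefore the index
$[a : b^{\chi} - b]$ is infinite. Applying Witt's result we get the following theorem.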


THEOREM 4. Let G be a group of order $\chi^m$ and $k_{\mathfrak{p}}$ a complete discrete
field of characteristic $\chi$. There exist infinitely many normal extensions K
over $k_{\mathfrak{p}}$ whose Galois groups are equal to G.


Remark. Suppose that $\Omega$ is a finite Galois field. If we require that the
discriminants of the extensions K over $k_{\mathfrak{p}}$ are bounded, we obtain restrictions
on the number of generators $n$ of G. Moreover, one finds that there exists
only a finite number of such fields. The computations are simple and similar
to the ones carried out by Witt. The actual computations show that the
number of generators depends on the given bound and the number of elements
in $\Omega$.

In order to determine the totality of all cyclic extensions Z of degree $\chi$
over $k = \Omega(x, y)$ which are at most ramified at a finite number of prime
divisors $\mathfrak{p}_1, \dots, \mathfrak{p}_r$, we have to consider the index $[a' : b^{\chi} - b]$. There $a'$
denotes the additive group of all elements $a'$ in $k$ for which

$v_{\mathfrak{p}_j}(a') \not\equiv 0 \ (\chi)$ and $v_{\mathfrak{p}_j}(a') < 0$.

Witt's results imply that the index $[a' : b^{\chi} - b]$ is infinite as soon as $r \ge 1$.


THEOREM 5. Let G be a group of order $\chi^m$. There exist infinitely many
normal extensions K of $k = \Omega(x, y)$ whose Galois groups are isomorphic with
G and which are ramified at a single prime divisor $\mathfrak{p}$ of $k$.


Proof. Let again $G^*$ be the subgroup of G which is generated by all
elements of the form $g_1 g_2 g_1^{-1} g_2^{-1}$ and $g^{\chi}$. Then $G/G^*$ is an abelian group of order
$\chi^n$ and type $(\chi, \chi, \dots, \chi)$. Let $\mathfrak{p}$ be the given prime divisor. Then there
exists at least one abelian extension $K^*$ over $k$ which is at most ramified at $\mathfrak{p}$.
The field K we have to construct is determined by a chain of cyclic extensions
$K_{j+1}/K_j$ of degree $\chi$.

According to Witt there exist fields $K_j$ such that K has the right Galois group
over $k$. A relatively simple argument using Theorem 3 and the theory of
divisor classes yields that the extensions $K_j$ can be chosen such that no new




ramified prime divisors except the divisors of $\mathfrak{p}$ are added. The classes to be
considered are $(\mathfrak{P}_j^{c})$, $c > 0$, where $\mathfrak{P}_j \mid \mathfrak{p}$ in $K_j$.


Remark 1. Since $k$ contains only a finite number of divisor classes of
degree 0 and exponent $\chi$ the number of generators $n$ of G is limited if we want
to construct unramified extensions K/k.¹² One actually can exhibit examples
of fields $k$ of genus $p$ for which $\tau(\chi) = 1$. It is never possible to construct
unramified extensions K/k whose Galois groups have order $\chi^m$.


Remark 2. Let $k = \Omega(x)$ be a rational function field and G a group of
order $\chi^m$. There exist always infinitely many normal fields K/k whose Galois
groups are isomorphic with G and which are ramified at a single prescribed
prime divisor $\mathfrak{p}$ of $k$. If we impose the condition that the discriminant of
K/k is bounded and that $\Omega$ is finite, then we can again find relations between
the number of generators and the bound. The number of different fields can
be determined; it is finite.


Remark 3. If $\Omega$ is a finite Galois field of $q = \chi^f$ elements, then most of
the preceding results can be carried over mutatis mutandis. It turns out that
the number $q$ and the genus $p$ of the underlying field $k$ have to be considered.
But we do not insist on an explicit computation of the respective formulae.¹³

Finally we remark that the preceding theorems when compared with the 
theory of algebraic number fields, indicate the close relationship between the 
genus of our function fields and the class number of an algebraic number field. 


THE JOHNS HOPKINS UNIVERSITY. 


¹² H. Hasse and E. Witt, "Zyklische unverzweigte Erweiterungskörper vom
Primzahlgrade p über einem algebraischen Funktionenkörper der Charakteristik p,"
Monatshefte für Mathematik und Physik, vol. 43 (1936).

¹³ E. Witt, "Der Existenzsatz für abelsche Funktionenkörper," Crelle, vol. 173
(1935).




BOREL SUMMABILITY AND LAMBERT SERIES.* 


By Tomlinson Fort.


There are several theorems in the theory of convergent series of which the
following¹ is an example.


If $a_n(z)$ and $b_n(z)$, $n = 0, 1, 2, \dots$, are defined over a set P and if

(1) $\sum_{n=0}^{\infty} a_n(z)$

converges uniformly over P and

$\sum_{n=0}^{\infty} |b_{n+1}(z) - b_n(z)| \le G$,

a constant, and $|b_0(z)| < G$, a constant; then

$\sum_{n=0}^{\infty} a_n(z) b_n(z)$

converges uniformly over P.
We shall refer to these theorems as introductory theorems. They are a 
ready consequence of the formula for summation by parts, 


(2) $\sum_{n=k}^{N} u_n w_n = u_{N+1} \sum_{n=k}^{N} w_n - \sum_{n=k}^{N} (\Delta u_n) \sum_{j=k}^{n} w_j.$

In the present paper theorems analogous to introductory theorems are 
proved for Borel Integral summability. The theorems which are proved are 
applied to Lambert series. 
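
The summation-by-parts identity behind the introductory theorems can be checked numerically. The following is only a sketch under stated assumptions: it reads the identity as "sum of u_n w_n from n = k to N equals u_{N+1} times the full partial sum of the w's, minus the sum of (u_{n+1} - u_n) times the running partial sums of the w's", and the function name and the sample sequences are hypothetical choices made for illustration, not taken from the paper.

```python
from fractions import Fraction


def summation_by_parts_sides(u, w, k, N):
    """Return (lhs, rhs) of the summation-by-parts identity for n = k..N.

    lhs = sum_{n=k}^{N} u_n w_n
    rhs = u_{N+1} W_N - sum_{n=k}^{N} (u_{n+1} - u_n) W_n,
    where W_n = w_k + ... + w_n.  The list u must reach index N + 1.
    """
    lhs = sum(u[n] * w[n] for n in range(k, N + 1))
    partial = Fraction(0)      # running partial sum W_n
    correction = Fraction(0)   # accumulates (u_{n+1} - u_n) * W_n
    for n in range(k, N + 1):
        partial += w[n]
        correction += (u[n + 1] - u[n]) * partial
    rhs = u[N + 1] * partial - correction
    return lhs, rhs


# Hypothetical sample data: exact rationals, so the comparison is exact.
u = [Fraction(1, n + 1) for n in range(12)]
w = [Fraction((-1) ** n * (n + 2)) for n in range(12)]

lhs, rhs = summation_by_parts_sides(u, w, 0, 10)
print(lhs == rhs)
```

Exact rational arithmetic is used so that the two sides can be compared for equality rather than up to rounding.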


1. Borel's integral definition.² We shall prove the following theorem:


THEOREM I. If $a_n(z)$ and $b_n(z)$, $n = 0, 1, 2, \dots$, are defined at all
points of a set P, $b_n(z) \to b(z)$ where $|b(z)| < B$ and if (1) is uniformly
Borel integral summable over P; if, moreover, there exists a function $f(n, z) \ne 0$
when z is of P such that


* Received September 12, 1938. 

¹ The names Abel, Dedekind, Dirichlet and du Bois-Reymond have been attached to
theorems of this type. See Knopp, Theorie und Anwendung der unendlichen Reihen,
S. 316; Bromwich, Theory of Infinite Series (Second Edition), p. 246; Fort, Infinite
Series, p. 106.

² Leçons sur les séries divergentes, ed. 1901, p. 98.




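(3) $\sum_{n=0}^{\infty} |\Delta b_n(z)| \, |f(n, z)|$

converges³ uniformly in z over P to a bounded function and such that

(4) $\frac{1}{f(n, z)} \, e^{-a} \sum_{m=0}^{n} a_m(z) \frac{a^m}{m!}$

is bounded in n and z and such that⁴

$\int_0^{\infty} \frac{1}{f(n, z)} \, e^{-a} \sum_{m=0}^{n} a_m(z) \frac{a^m}{m!} \, da$

converges uniformly in n and z; then
(5) $\sum_{n=0}^{\infty} a_n(z) b_n(z)$
is uniformly Borel integral summable over P.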


Denote the Borel integral formed for (1) by s. Then 


(6) s— ba (z) an (2) = da. 


n=0 


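Now in (2) let $k = 0$, $u_n = b_n(z)$ and $w_n = a_n(z) a^n / n!$, and substitute in (6).

We get
(7) $\lim_{A \to \infty} \int_0^{A} \Bigl[ b(z) e^{-a} \sum_{n=0}^{\infty} a_n(z) \frac{a^n}{n!} - \sum_{n=0}^{\infty} (\Delta b_n(z)) \, e^{-a} \sum_{m=0}^{n} a_m(z) \frac{a^m}{m!} \Bigr] da.$

Denote the integrand in (6) by $g(a, z)$. We have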


k’ 
| 
k 


Since (4) is bounded in n, a and z and since (3) converges uniformly
in z over P, the infinite series under the second integral sign in the right-hand
member converges uniformly in a by the Weierstrass⁵ test, and can be
integrated term by term. Consequently, we can write


k’ an 
<B| f an (z) — da 
! 


n=0 


k’ © n n | 
k n=0 n=0 ns | 


³ $\Delta b_n(z) = b_{n+1}(z) - b_n(z)$.
⁴ Throughout this paper a is real.
⁵ Fort, Infinite Series, Theorem 113.




n 
<B| f an (2) da 
| k n! 


n=0 


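We have assumed that (1) is uniformly Borel integral summable over P,
that is, that

$\int_0^{\infty} e^{-a} \sum_{n=0}^{\infty} a_n(z) \frac{a^n}{n!} \, da$

is a uniformly convergent integral. Under this assumption given any $\eta > 0$
choose $K$ so large that when $k' > k > K$, both

$\Bigl| \int_k^{k'} e^{-a} \sum_{n=0}^{\infty} a_n(z) \frac{a^n}{n!} \, da \Bigr| < \eta$
and
$\Bigl| \int_k^{k'} \frac{1}{f(n, z)} \, e^{-a} \sum_{m=0}^{n} a_m(z) \frac{a^m}{m!} \, da \Bigr| < \eta.$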
Then 


| 


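Since by hypothesis

$\sum_{n=0}^{\infty} |\Delta b_n(z)| \, |f(n, z)|$

is bounded (by $B'$, say),

$\Bigl| \int_k^{k'} g(a, z) \, da \Bigr| < \eta (B + B') < \epsilon$

if $\eta < \epsilon / (B + B')$.

Under the assumptions made we have thus established the uniform convergence
of the integral (6) and so the uniform Borel integral summability
of (5) and so Theorem I.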
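
As a concrete illustration of Borel's integral definition, the divergent series 1 - 1 + 1 - 1 + ... can be summed numerically. This is a minimal sketch, assuming the usual reading of the definition as the integral from 0 to infinity of e^{-a} times the sum of a_n a^n / n!; the truncation length, the upper limit 40 and the Simpson rule are illustrative choices, and the function names are hypothetical.

```python
import math


def borel_transform(coeff, a, terms=200):
    """Partial sum of sum_n coeff(n) * a**n / n!, using the recurrence
    a**(n+1)/(n+1)! = (a**n/n!) * a/(n+1) to avoid large factorials."""
    total = 0.0
    power_over_factorial = 1.0  # a**0 / 0!
    for n in range(terms):
        total += coeff(n) * power_over_factorial
        power_over_factorial *= a / (n + 1)
    return total


def borel_integral(coeff, upper=40.0, steps=4000):
    """Composite Simpson approximation of the truncated Borel integral
    int_0^upper exp(-a) * borel_transform(coeff, a) da (steps must be even)."""
    h = upper / steps

    def integrand(a):
        return math.exp(-a) * borel_transform(coeff, a)

    acc = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        acc += (4 if i % 2 else 2) * integrand(i * h)
    return acc * h / 3.0


# Series sum_n z**n at z = -1: it diverges in the ordinary sense,
# while the Borel integral converges to 1 / (1 - z).
z = -1.0
value = borel_integral(lambda n: z ** n)
print(value, 1.0 / (1.0 - z))
```

Here the inner series sums to e^{za}, so the integrand is e^{-(1-z)a} and the printed approximation can be compared with 1/(1 - z).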


2. Absolute summability.⁶ Theorem I admits of an immediate analogue
for absolute Borel integral summability. Proof is omitted to avoid repetition.


THEOREM II. If $a_n(z)$ and $b_n(z)$, $n = 0, 1, 2, \dots$, are defined at all points of P,
$b_n(z) \to b(z)$ where $|b(z)| < B$ and if (1) is absolutely and uniformly Borel
integral summable over P, moreover, if there exists a function $f(n, z, \lambda) \ne 0$
when z is of P such that

$\sum_{n=0}^{\infty} |\Delta b_n(z)| \, |f(n, z, \lambda)|$


⁶ Borel, loc. cit., p. 99.




converges uniformly over P to a bounded function for each $\lambda$ and such that


1 n 
f(n, Zz, X) 2 an (2) (n—a)! r)! 
is bounded in n, z and a for each $\lambda$ and


* 00 1 

is uniformly convergent in n and z over P for each $\lambda$, then (5) is absolutely
and uniformly Borel integral summable over P.


3. Lambert and Power Series. Consider the power series

(8) $\sum a_n z^n$.

Denote the interior of the Borel polygon of summability⁷ of (8) by S.
We shall proceed to prove the following theorem.


THEOREM III. Let T be the radius of convergence of (8). Let $T' < T$
and let $f(n, z, \lambda) = |z / T'|^n$ if $|z| > T'$ and $f(n, z, \lambda) = 1$ if $|z| \le T'$. Then
series (8) fulfills all the conditions required of the a-series in Theorem II
so long as z is restricted to a subregion R of S lying in the finite plane and
with boundary always distant from the boundary of S by $\epsilon$.


Let 
and let 
= Gar 


Borel proved⁸ that, when z lies in R,
(10) lan | < k(A) 


where k(X) is independent of # and z. We proceed to examine 


gnqn- -r 
Sn(@, A) = (n—a)! 
We write 
(11) Gr = Sn (a, Z,A) + 1n(@, 2, A). 


⁷ Borel, loc. cit., p. 124.
⁸ Borel, loc. cit., p. 128.




Draw a simple closed curve, c, lying wholly within the polygon of summability,
enclosing R and distant from the boundary of the polygon of summability
by $\epsilon$ and from the boundary of R by $\delta$. Let L be the length of c. Then⁹


(12) 1'n(@, 2,A) 
From this and from (10) 


(13) | (a, 2, A) | S (A) e(1-)aK 
Hence by (10), (11) and (13) 

(14) | S K(A) (14 ea 2K i 
(15) | sn (a, 2,A)| S 2K 


if $|z| \le T'$. The theorem follows from (10), (14) and (15).
We now consider the Lambert Series

(16) $\sum a_n \frac{p^n z^n}{p^n - z^n}$.

It can readily be proved by the introductory theorems that this converges at all
points at which (8) converges for which $|z| \ne |p|$ and absolutely and uniformly
over any finite region lying within the circle of convergence of (8) and whose
boundary remains a distance $\delta$ from the circle of convergence and the circle
$|z| = |p|$.
We proceed to examine the conditions imposed on the b-series by Theorem
II when $b_n = p^n / (p^n - z^n)$. We announce the following theorem.


THEOREM IV. If T is the radius of convergence of (8) and $T' < T$ and
$f(n, z, \lambda) = |z|^n / T'^n$ if $|z| > T'$ and $f(n, z, \lambda) = 1$ if $|z| \le T'$, then
$b_n = p^n / (p^n - z^n)$ fulfills all the conditions imposed on the b-series by
Theorem II if z is at least distant from the circle $|z| = |p|$ by $\epsilon$ and either
$|z| \le T'$ or $|z| < |p|$ and $|z|^2 < |pT'|$ simultaneously.


This theorem is readily proved by means of the test-ratio test of the
elementary theory of series. We, consequently, are able to state the following
theorem.


⁹ See, for example, Osgood, Funktionentheorie (zweite Auflage), S. 316.




THEOREM V. Series (16) is absolutely summable by the Borel integral
method at all points at which series (8) is absolutely Borel integral summable
provided $|z| \ne |p|$ and either $|z| < T$, the radius of convergence of (8),
or $|z| < |p|$ and $|z|^2 < |pT|$ simultaneously. It is uniformly summable
over any bounded region within the interior of the polygon of summability of
(8) such that $|z| \le |p| - \epsilon$ and $|z|^2 < |p| (T - \epsilon)$ simultaneously at all
points of the region.

It is to be remarked that if $|z| < T$ and $|z| \ne |p|$, (16) is absolutely
convergent and hence absolutely summable.
It is further to be remarked that on account of the symmetry of (16)
any theorem relative to z, p is also true relative to p, z.


LEHIGH UNIVERSITY. 




SINGULAR MAPS OF DIFFERENTIABLE MANIFOLDS OF N 
DIMENSIONS INTO N DIMENSIONAL 
EUCLIDEAN SPACE.* 


By T. Y. THomas. 


1. Let M be a separable and compact differentiable manifold of n ($\ge 2$)
dimensions and of class $C^r$ ($r \ge 1$). It can be shown that it is possible to
define a Riemann metric of class $C^{r-1}$ in M, i.e., it is possible to define a positive
definite quadratic differential form over M with coefficients which are
functions of class $C^{r-1}$ of the coördinates of the allowable coördinate
neighborhoods in M.¹

Let g be a fixed Riemann metric of class $C^{r-1}$ in M which exists in
accordance with the above result. When it is desired to measure distances
independently of coördinate systems this metric may be used although within
the separate coördinate systems it may sometimes be of advantage to employ
the ordinary Euclidean metric.

Let f be a map of class $C^r$ of M into the Euclidean space $E_m$. Let S be
the set of all such maps f. We may define a metric in S in the following
manner: Let $\phi$ and $\psi$ be two elements of S. Put


p(x) ] = [b*a— Ya] 


where it is to be understood that Latin indices have the range $1, \dots, m$ and
that Greek indices have the range $1, \dots, n$ and both sets of indices are to
be summed when repeated in accordance with the usual convention. Note also
that $d_1, \dots, d_r$ involve successive covariant derivatives of the functions $\phi^a$
and $\psi^a$ and that $d_r$ involves the r-th covariant derivative which is the highest
covariant derivative that can be formed under the hypothesis that the Riemann
metric is of class $C^{r-1}$ and the maps $\phi$ and $\psi$ are of class $C^r$. Put


* Received October 6, 1938. 
¹ H. Whitney, "Analytic coördinate systems and arcs in a manifold," Annals of
Mathematics, vol. 38 (1937), p. 816.



Then $D[\phi(x), \psi(x)]$ is a continuous function on M. Since M is compact and
separable it is bicompact and hence the function $D[\phi(x), \psi(x)]$ assumes its
maximum value at a point of M. Denote this maximum value by $D(\phi, \psi)$
and define $D(\phi, \psi)$ as the distance between the points $\phi$ and $\psi$ of S. Evidently
$D(\phi, \psi) \ge 0$. In fact it is easily seen that with this definition of distance S
is a metric space.

The map space S is complete, i.e. every Cauchy sequence in S converges.
The proof is simple and will be omitted.²

The map or point $\phi$ in S will be said to be regular if the matrix $\| \partial \phi^a / \partial x^\alpha \|$
has rank n at every point of M. Otherwise $\phi$ will be said to be singular and
the points at which the above matrix has rank less than n will be called singular
points of the map. The set of all such singular points will be called the domain
of singularity of the map $\phi$.

It is easily seen that the regular maps in S form an open set. On the
basis of methods used by H. Whitney³ the following results can be proved:
(a) if $r \ge 2$ and $m \ge 2n$ the regular maps are dense in S and (b) if $r \ge 2$
and $m \ge 2n + 1$ the regular topological maps form an open and dense set in S.
Since the map $f = 0$ is an element of S it follows that for $r \ge 2$ and $m \ge 2n$
the set of regular maps in S is non-vacuous, and for $r \ge 2$ and $m \ge 2n + 1$
the set of regular topological maps is non-vacuous in S.


2. In the following we take $m = n$. Denote by $\bar U$ the cube $-1 \le x^\alpha \le 1$
in $E_n$ and its interior, namely $-1 < x^\alpha < 1$, by $U$. Similarly let $\bar U'$ be the
cube $-\tfrac12 \le x^\alpha \le \tfrac12$ with interior $-\tfrac12 < x^\alpha < \tfrac12$ denoted by $U'$. It is possible
to construct a function $H(x^1, \dots, x^n)$ of class $C^\infty$ in $E_n$ possessing the following
properties:

(a) $H = 1$ in $\bar U'$,

(b) $H = 0$ in $E_n - U$,

(c) $0 < H < 1$ in $U - \bar U'$,
(d) $H$ is analytic in $U - \bar U'$.
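
A cut-off function of this kind can be written down explicitly in one dimension. The sketch below is an illustration only, assuming the standard construction from exp(-1/t); it is not claimed to be the construction the author has in mind, the helper names are hypothetical, and it covers only the case n = 1 (in several variables a product of such factors would not obviously satisfy property (d)).

```python
import math


def _h(t):
    """exp(-1/t) for t > 0 and 0 otherwise; infinitely differentiable on the line."""
    return math.exp(-1.0 / t) if t > 0 else 0.0


def smooth_step(t):
    """Equals 0 for t <= 0 and 1 for t >= 1; strictly between 0 and 1 in between.
    The denominator never vanishes because t and 1 - t cannot both be <= 0."""
    return _h(t) / (_h(t) + _h(1.0 - t))


def H(x):
    """One-dimensional cut-off: 1 for |x| <= 1/2, 0 for |x| >= 1,
    and strictly between 0 and 1 for 1/2 < |x| < 1."""
    return 1.0 - smooth_step(2.0 * abs(x) - 1.0)


for x in (0.0, 0.5, 0.75, 1.0, 2.0):
    print(x, H(x))
```

On 1/2 < |x| < 1 the function is a composition of analytic functions, which is the one-dimensional counterpart of property (d). Very close to |x| = 1/2 the floating-point value rounds to exactly 1, so numerical checks of property (c) should be made away from the edges.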


On the basis of these properties the following lemma can be proved. 


Lemma 1. The function H can not be homogeneous of degree $-1$ in
an open set in U.


² A complete metric map space has been used in purely topological embedding
problems by W. Hurewicz, "Über Abbildungen von endlichdimensionalen Räumen auf
Teilmengen cartesischer Räume," Preuss. Akad. Wissenschaften, Berlin, vol. 24 (1933),
p. 754.
³ H. Whitney, "Differentiable manifolds," Annals of Mathematics, vol. 37 (1936),
p. 645.




Proof. Assume H homogeneous of degree $-1$ in an open set $V \subset U$.
Then V can not lie entirely in $\bar U'$ since $H = 1$ in $\bar U'$ by (a). Hence V must
contain a point in $U - \bar U'$ and hence an open set W such that $W \subset V$ and
$W \subset U - \bar U'$. By assumption H is then homogeneous of degree $-1$ in W.
Let $x^\alpha = \xi^\alpha t$ with the $\xi$'s constant and $t \ge 0$ define a straight line $l$ issuing
from the origin in $E_n$ and passing through a point of W. Then by our
assumption $H = c/t$ along $l$ in W where $c$ is a suitably chosen constant. From
(c) we must have $c > 0$. Since H is analytic in $U - \bar U'$ by (d) it follows that
H must be represented by $c/t$ along the portion of $l$ contained in $U - \bar U'$. From
this fact and the continuity of H in $E_n$ it follows that $H \ne 0$ at the intersection
of $l$ with the boundary of U. As this is in contradiction with (b) the
proof is complete.


3. Let $\phi^1, \dots, \phi^n$ be a set of n functions of class $C^r$ ($r \ge 1$) in U.
Consider the functions $\psi^\alpha = \phi^\alpha + H z^\alpha_\beta x^\beta$ in U where the z's are arbitrary
constants. Also consider the functional determinant $\Delta = | \partial \psi^\alpha / \partial x^\beta |$ in U. Let
J denote the set of points in U at which $\Delta$ vanishes identically in the $z^\alpha_\beta$.


Lemma 2. The set J is nowhere dense in U.


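Proof. We have

$\frac{\partial \psi^\alpha}{\partial x^\beta} = \frac{\partial \phi^\alpha}{\partial x^\beta} + \Bigl( H \delta^\gamma_\beta + x^\gamma \frac{\partial H}{\partial x^\beta} \Bigr) z^\alpha_\gamma .$

If at a point $P \subset U$ the determinant $\Delta'$ of the coefficients of the z's in the
above representation of $\Delta$ is different from zero we can choose the z's so that
at P the elements of $\Delta$ can have arbitrary values, for example the values $\delta^\alpha_\beta$.
At such a point P the determinant $\Delta$ will not vanish identically in the z's.
Denote by $J'$ the set of points in U at which $\Delta'$ is zero. We have

$\Delta' = \Bigl| H \delta^\gamma_\beta + x^\gamma \frac{\partial H}{\partial x^\beta} \Bigr| = H^{n-1} \Bigl( H + x^\alpha \frac{\partial H}{\partial x^\alpha} \Bigr).$

Since $H \ne 0$ in U it follows that $\Delta' = 0$ if, and only if, the above expression
in the parenthesis is equal to zero. By Lemma 1 this expression can not vanish
in an open set $W \subset U$ since this is the condition for H to be homogeneous of
degree $-1$ in W. Hence $\Delta'$ can not vanish in an open set in U. Any open
set $W \subset U$ must therefore contain a point Q at which $\Delta' \ne 0$ and, since $\Delta'$ is
a continuous function, there must be a neighborhood $W(Q) \subset W$ which contains
no point of $J'$, i.e. $J'$ is nowhere dense in U. Hence J is nowhere dense
in U since $J \subset J'$.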


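4. Let R be the set of rational points in U, i.e. the points in U having
rational coordinates. Consider the intersection $R \cdot J$ of R with the nowhere
dense set J. Then $K = R - R \cdot J$ is dense in U and countably infinite. Let
$P_1, P_2, \dots$ denote the points of K.

To indicate the dependence of the determinant $\Delta$ on the selection of the
constants $z^\alpha_\beta$ let us write $\Delta = \Delta(z^\alpha_\beta)$. Let O be any open set in $E_n$. Since
$\Delta(z^\alpha_\beta) \not\equiv 0$ in the z's at $P_1$ we can find a set of points $z_{\beta|1} \subset O$ such that
$\Delta(z^\alpha_{\beta|1}) \ne 0$ at $P_1$. Hence there exist open sets $W(z_{\beta|1})$ with enclosures
$\bar W(z_{\beta|1}) \subset O$ such that $\Delta(z^\alpha_\beta) \ne 0$ at $P_1$ for $z_\beta \subset \bar W(z_{\beta|1})$. Similarly we can
find points $z_{\beta|2} \subset W(z_{\beta|1})$ such that $\Delta(z^\alpha_{\beta|2}) \ne 0$ at $P_2$ and then we can find
open sets $W(z_{\beta|2})$ with enclosures $\bar W(z_{\beta|2}) \subset W(z_{\beta|1})$ such that $\Delta(z^\alpha_\beta) \ne 0$
at $P_2$ for $z_\beta \subset \bar W(z_{\beta|2})$. Hence $\Delta(z^\alpha_\beta) \ne 0$ at $P_1$ and $P_2$ for $z_\beta \subset \bar W(z_{\beta|2})$.
Continuing we obtain after k steps the sets $\bar W(z_{\beta|1}) \supset \bar W(z_{\beta|2}) \supset \cdots \supset \bar W(z_{\beta|k})$
with $\Delta(z^\alpha_\beta) \ne 0$ at $P_1, \dots, P_k$ for $z_\beta \subset \bar W(z_{\beta|k})$.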

Now the decreasing sequence

$\bar W(z_{\beta|1}) \supset \bar W(z_{\beta|2}) \supset \cdots$

of non-vacuous closed sets has a non-vacuous intersection (Cantor's intersection
theorem). Let $Y_\beta \subset O$ be a point of this intersection. It follows that
$\Delta(Y^\alpha_\beta)$ does not vanish in U at any point of the set K. From the continuity
of the function $\Delta(Y^\alpha_\beta)$ and the fact that K is dense in U we see that $\Delta(Y^\alpha_\beta)$
is different from zero over a set $T \supset K$ which is dense and open in U. Hence
the complement L of T in U over which $\Delta(Y^\alpha_\beta) = 0$ is nowhere dense and is
evidently closed. We state these results in the following


Lemma 3. It is possible to choose points $Y_1, \dots, Y_n$ in any open set
in $E_n$ such that the functional determinant $\Delta(Y^\alpha_\beta)$ will vanish in U only over
a set which is closed and nowhere dense in U.


5. Let P be any point of M and denote by N(P) a coördinate neighborhood
of P. We suppose the coördinates $x^\alpha$ so chosen in N(P) that $x^\alpha = 0$
at P. By a coördinate transformation in N(P) of the form $\bar x^\alpha = a x^\alpha$ ($a$ = constant)
we can so enlarge (if necessary) the coördinate representation of this
neighborhood that it will contain the above cube $\bar U$. We now consider a
covering of M by the open sets U. Since M is bicompact this covering contains
a finite covering. Let such a finite covering be denoted by $U_1, \dots, U_t$.

Now consider the metric space S (§ 1) the elements of which are the maps
of class $C^r$ ($r \ge 1$) of M into $E_n$. Denote by $\Sigma$ the set of all points of S which
correspond to maps whose domain of singularity is nowhere dense in M. We
shall show that $\Sigma$ is dense in S, i.e. if $\phi$ is any point of S and $\rho$ any positive
constant there exists a point $\psi \subset \Sigma$ such that $D(\phi, \psi) < \rho$. Put




$\phi_1^\alpha = \phi^\alpha + H z^\alpha_\beta x^\beta$ in $U_1$,
$\phi_1^\alpha = \phi^\alpha$ in $M - U_1$,

where the z's are constants. Then $\phi_1$ is a map of class $C^r$ of M into $E_n$ and
hence is an element of S. Restrict the constants z to be the coördinates of
points $z_\beta$ in the open set $|x^\alpha| < \eta$ in $E_n$. By Lemma 3 we can furthermore
choose the $z^\alpha_\beta$ so that the set of singular points of $\phi_1$ in $U_1$ be nowhere dense
in $U_1$. In addition by taking the above constant $\eta$ sufficiently small the map
$\phi_1$ will approximate the map $\phi$ as closely as desired. Hence we can secure the
condition $D(\phi, \phi_1) < \rho / t$.

We now approximate the map $\phi_1$ in a similar manner by a map $\phi_2$ using
the neighborhood $U_2$ for this purpose. In fact we put

$\phi_2^\alpha = \phi_1^\alpha + H z^\alpha_\beta x^\beta$ in $U_2$,
$\phi_2^\alpha = \phi_1^\alpha$ in $M - U_2$,

and choose the constants $z^\alpha_\beta$ so that the set of singular points of $\phi_2$ in $U_2$ will
be nowhere dense in $U_2$ and also so that $D(\phi_1, \phi_2) < \rho / t$. Then the set of
singular points of $\phi_2$ in $U_1 + U_2$ will be nowhere dense in $U_1 + U_2$. Continuing
we obtain maps $\phi_1, \dots, \phi_t$ such that for any k with $1 \le k \le t$ we
have (a) $\phi_k \subset S$, (b) $D(\phi_{k-1}, \phi_k) < \rho / t$ with $\phi_0 = \phi$ and (c) the set of
singular points of $\phi_k$ in $U_1 + \cdots + U_k$ is nowhere dense in $U_1 + \cdots + U_k$.
Then $\psi = \phi_t$ is the desired map. In fact the domain of singularity of $\psi$ is
nowhere dense in M and

$D(\phi, \psi) \le D(\phi, \phi_1) + D(\phi_1, \phi_2) + \cdots + D(\phi_{t-1}, \phi_t) < t (\rho / t) = \rho.$
We have now proved the following theorem. 


THEOREM. Let M be a compact and separable manifold of n ($\ge 2$)
dimensions and of class $C^r$ ($r \ge 1$) and S the metric space (§ 1) whose elements
are the maps of class $C^r$ of M into $E_n$. Let $\Sigma$ be the set of points in S
which correspond to maps whose domain of singularity is nowhere dense in M.
Then the set $\Sigma$ is dense in S.


6. Since $f = 0$ is a map of class $C^r$ of M into $E_n$ the set $\Sigma$ is non-vacuous
and the above theorem admits the following


COROLLARY I. Any compact and separable manifold M of n ($\ge 2$) dimensions
and class $C^r$ ($r \ge 1$) can be embedded in the Euclidean space of n
dimensions by a map of class $C^r$ whose domain of singularity is closed and
nowhere dense in M.




If we extend the strict definition of the Riemann space to include spaces
with metric defined by a non-negative quadratic differential form (i.e. if we
omit the restriction that this form be non-singular at every point of the space)
a second corollary can be stated.


COROLLARY II. In any compact and separable manifold M of n ($\ge 2$)
dimensions and class $C^r$ ($r \ge 1$) a Riemann metric of class $C^{r-1}$ can be defined
which is non-singular and locally flat in the ordinary sense except over a closed
and nowhere dense set in the space.


UNIVERSITY OF CALIFORNIA, 
Los ANGELES. 



THE DISTRIBUTION OF THE MAXIMA OF A RANDOM CURVE.* 


By S. O. Rice.


Introduction. The equation
(1) $y = F(a_1, a_2, \dots, a_n, x)$

defines y as a function of x when the values of the parameters $a_1, \dots, a_n$ are assigned.
We assume that y is single valued and real for a certain range of values of x
and that the parameters $a_1, \dots, a_n$ are chance variables whose distribution
functions are known. When a set of parameters is selected at random y may
be plotted as a function of x. We shall call a curve obtained in this way a
random curve.

In general, a random curve will have a number of maxima, their number
and positions depending upon the particular values of $a_1, \dots, a_n$ which happened
to be drawn. Here the distribution of the maxima of such random
curves is studied. Although this problem is of some physical interest I have
been unable to find references to any earlier work. Problems of this nature
occur in the investigation of the current reflected by small random irregularities
along telephone transmission lines.

We shall speak of the probability P(R) that a maximum of a random
curve will occur in a given region R of the (x, y) plane and also of the
expected number of maxima E(R) in that region. To make these terms clear,
we suppose N sets of the parameters $(a_1, \dots, a_n)$ are drawn and the corresponding
N curves plotted. Let $N_r$ be the number of curves which have one
or more maxima in R and $N_m$ the total number of maxima in R. Thus, if the
first curve has one maximum in R, the second curve none, the third two, and
so on until the last one has, say, three maxima, then

$N_r = 1 + 0 + 1 + \cdots + 1$
and
$N_m = 1 + 0 + 2 + \cdots + 3.$
P(R) and E(R) are then given by

$P(R) = \lim_{N \to \infty} \frac{N_r}{N}, \qquad E(R) = \lim_{N \to \infty} \frac{N_m}{N},$

where the limits are assumed to exist.
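
The two ratios can be estimated by simulation. The sketch below is purely illustrative and rests on assumptions not in the paper: the random curve is taken to be y = a_1 sin x + a_2 sin 2x with independent standard normal parameters, the region R is 0 < x < 2π with y above a threshold, and maxima are detected on a finite grid. All function names are hypothetical.

```python
import math
import random


def count_maxima_in_region(a1, a2, y_min, grid=400):
    """Count grid-detected local maxima of y = a1*sin(x) + a2*sin(2x)
    lying in the region 0 < x < 2*pi, y > y_min."""
    xs = [2.0 * math.pi * i / grid for i in range(grid + 1)]
    ys = [a1 * math.sin(x) + a2 * math.sin(2.0 * x) for x in xs]
    count = 0
    for i in range(1, grid):
        is_local_max = ys[i - 1] < ys[i] >= ys[i + 1]
        if is_local_max and ys[i] > y_min:
            count += 1
    return count


def estimate(num_curves=2000, y_min=1.0, seed=0):
    """Return (N_r / N, N_m / N) for N simulated curves."""
    rng = random.Random(seed)
    n_r = 0  # curves with one or more maxima in R
    n_m = 0  # total number of maxima in R
    for _ in range(num_curves):
        a1 = rng.gauss(0.0, 1.0)
        a2 = rng.gauss(0.0, 1.0)
        k = count_maxima_in_region(a1, a2, y_min)
        if k > 0:
            n_r += 1
        n_m += k
    return n_r / num_curves, n_m / num_curves


p_hat, e_hat = estimate()
print(p_hat, e_hat)
```

Since every curve counted in N_r contributes at least one maximum to N_m, the estimate of E(R) can never fall below the estimate of P(R); the two differ exactly when some curves have more than one maximum in R.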
The discussion given in the section entitled "Theory" establishes the


* Received July 15, 1938. 




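following result: If suitable conditions are satisfied the probability that the
random curve
(2) $y = F(a_1, \cdots, a_n, x)$


has a maximum in the rectangle $(x_0, x_0 + dx_0;\ y_0, y_0 + dy_0)$, $dx_0$ and $dy_0$
being of the same order of magnitude, is $p(x_0, y_0)\,dx_0\,dy_0$ where


(3) $p(x_0, y_0) = -\int_{-\infty}^{0} P(y_0, 0, \zeta)\,\zeta\,d\zeta,$

$P(\xi, \eta, \zeta)\,\Delta\xi\,\Delta\eta\,\Delta\zeta$ is the probability to within first order terms that, when
$x = x_0$, $F(a_1, \cdots, a_n, x)$ and its first two partial derivatives with
respect to x lie within the intervals

$\xi < F < \xi + \Delta\xi, \qquad \eta < \frac{\partial F}{\partial x} < \eta + \Delta\eta, \qquad \zeta < \frac{\partial^2 F}{\partial x^2} < \zeta + \Delta\zeta.$


$P(\xi, \eta, \zeta)$ may be determined from $F(a_1, \cdots, a_n, x)$ and the distribution
functions of the chance variables $a_1, a_2, \cdots, a_n$. In general, it will involve
x as a parameter. The probability that a maximum occurs in the strip
$(x_0, x_0 + dx_0)$ is


$dx_0 \int_{-\infty}^{+\infty} p(x_0, y)\,dy.$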

When $F(a_1, \cdots, a_n, x)$ and $P(\xi, \eta, \zeta)$ satisfy the conditions assumed in the
next section the probability of the occurrence of a maximum in the rectangles
and strips just mentioned is equal to the expected number of maxima in the
respective regions. However, for larger regions this is no longer true, and


$\int_a^b dx \int_c^d dy\; p(x, y)$


is not the probability that a maximum of the curve will occur in the rectangle
(a, b; c, d), but is the expected number of maxima in that region. Incidentally,
considerations of this sort enable us to form an estimate of the
sinuosity of a random curve.

Similar results hold for the distribution of the minima of a random curve,
the expression for the probability that a minimum occurs in the rectangle
$(x_0, x_0 + dx_0;\ y_0, y_0 + dy_0)$ being


$dx_0\,dy_0 \int_{0}^{+\infty} P(y_0, 0, \zeta)\,\zeta\,d\zeta.$




The example which is used to illustrate the theory indicates that the
restrictions placed on $F(a_1, \cdots, a_n, x)$ in the derivation of equation (3) are
harsher than necessary.


Theory. We assume that $F(a_1, \cdots, a_n, x)$ and its first three derivatives
with respect to x are continuous, single-valued functions of x, and are bounded,
but the bounds are not zero, for all possible values of the parameters and for
the values of x in the range of interest. It is also assumed that the distribution
functions of the parameters and the form of $F(a_1, \cdots, a_n, x)$ are such that
the distribution function $P(\xi, \eta, \zeta)$ is a continuous function of its arguments
for any fixed value of x. For convenience we shall write $F(a_1, \cdots, a_n, x)$
as F(x) and indicate derivatives with respect to x by primes.

In the proof of equation (3) we find a set of conditions, denoted by I,
which a random curve satisfies if it has a maximum in the rectangle
$(x_0, x_0 + \Delta x;\ y_0, y_0 + \Delta y)$. We shall call this rectangle the elementary
rectangle, and take $\Delta x$ and $\Delta y$ to be of the same order of magnitude. However,
a random curve may not have a maximum in the elementary rectangle yet
still satisfy conditions I. Another set of conditions II are then found which
when satisfied by a random curve guarantee that it has a maximum in the
rectangle. However, a random curve may have a maximum in the rectangle
yet not satisfy conditions II.

The probability $q(\Delta x, \Delta y)$ that a random curve has one or more maxima
[here we wish to emphasize the exact meaning of $q(\Delta x, \Delta y)$ so we say “one
or more maxima” instead of merely “a maximum”] in the elementary
rectangle is less than the probability that conditions I are satisfied and greater
than the probability that conditions II are satisfied. It is shown that when
$\Delta x$ and $\Delta y$ approach zero the expressions for these last two probabilities
approach the same limiting form. Hence $q(\Delta x, \Delta y)$ also approaches the same
form in the limit. The expression is found to be $p(x_0, y_0)\Delta x\Delta y$ where
$p(x_0, y_0)$ is given by equation (3). Finally it is shown, under the assumptions
stated above, that as the rectangle becomes small the probability of two or
more maxima occurring in it becomes small in comparison with the probability
that only one occurs.

If a random curve y = F(x) has a maximum in the elementary rectangle
it must satisfy each one of the three inequalities


I:
(4) $y_0 - M_2(\Delta x)^2 \le F(x_0) \le y_0 + \Delta y + M_2(\Delta x)^2$
(5) $-M_3(\Delta x)^2 < F'(x_0)$
(6) $F'(x_0) + \Delta x\,F''(x_0) < 2M_3(\Delta x)^2$


where $M_2$ and $M_3$ are the upper bounds, for all allowable values of the
parameters and x, of $|F''(x)|$ and $|F'''(x)|$. These are the conditions referred
to above as conditions I. Let the maximum be at the point $(x_1, y_1)$. Then
$y_1 = F(x_1)$, $F'(x_1) = 0$, $x_0 < x_1 < x_0 + \Delta x$, and


(7) $y_0 < F(x_1) < y_0 + \Delta y.$

If $x_0 \le x \le x_0 + \Delta x$, the mean value theorem gives

(8) $|F'(x)| = |F'(x) - F'(x_1)| < M_2\,\Delta x$

(9) $|F(x) - F(x_0)| \le \Delta x\,\mathrm{Max}\,|F'(x)| < M_2(\Delta x)^2.$


Setting $x = x_1$ in the last result and using (7) shows that the inequalities
(4) are satisfied by any random curve having a maximum in the elementary
rectangle.

If $F'(x_0) \ge 0$ condition (5) is automatically fulfilled. If $F'(x_0) < 0$
the random curve must have a minimum before it has the assumed maximum.
Thus there are two points in the interval $x_0 < x < x_0 + \Delta x$ where $F'(x) = 0$.
By Rolle’s theorem there is also a point, say $x = x_2$, such that $F''(x_2) = 0$.
The same reasoning used to establish (9) may be used to show that


(10) $|F'(x_0)| = |F'(x_1) - F'(x_0)| < M_3(\Delta x)^2.$


Thus (5) is also satisfied if $F'(x_0) < 0$.
In the same way it may be shown that


(11) $F'(x_0 + \Delta x) < M_3(\Delta x)^2.$
From Lagrange’s form for the remainder in Taylor’s series,


(12) $F'(x_0 + \Delta x) = F'(x_0) + \Delta x\,F''(x_0) + \frac{(\Delta x)^2}{2} F'''(x_0 + \theta\,\Delta x),$


where $0 \le \theta \le 1$, it follows that
$F'(x_0 + \Delta x) > F'(x_0) + \Delta x\,F''(x_0) - M_3(\Delta x)^2,$


and this together with inequality (11) gives inequality (6).
If a random curve y = F(x) satisfies the inequalities


II:
(13) $y_0 + M_2(\Delta x)^2 \le F(x_0) \le y_0 + \Delta y - M_2(\Delta x)^2$
(14) $0 < F'(x_0)$
(15) $F'(x_0) + \Delta x\,F''(x_0) < -M_3(\Delta x)^2$


then it has at least one maximum in the elementary rectangle. These
conditions were referred to as conditions II. It follows from equation (12) that


$F'(x_0 + \Delta x) < F'(x_0) + \Delta x\,F''(x_0) + M_3(\Delta x)^2$


which together with (15) shows that $F'(x_0 + \Delta x) < 0$. From this and (14)
it follows that there is at least one maximum of y = F(x) in the interval
$x_0 < x < x_0 + \Delta x$, say at $x = x_1$, so $F'(x_1) = 0$. Inequality (9) then holds
and it, together with (13), shows that


$y_0 < F(x_1) < y_0 + \Delta y.$


Thus the maxima of the random curve lie within the elementary rectangle
when $x_0 < x_1 < x_0 + \Delta x$.
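
The sandwich argument (conditions II imply a maximum in the elementary rectangle, which in turn implies conditions I) can be illustrated on a concrete family. The sketch below is an illustration only, under assumptions that are not in the paper: it uses $F(x) = \sin(x + \phi)$ with a phase $\phi$ swept over a grid, for which $M_2 = M_3 = 1$ bound $|F''|$ and $|F'''|$, and an arbitrary rectangle with $x_0 = 0$, $\Delta x = 0.1$, $y_0 = 0.95$, $\Delta y = 0.1$.

```python
import math

M2 = M3 = 1.0        # bounds on |F''| and |F'''| for F(x) = sin(x + phi)
X0, DX = 0.0, 0.1    # elementary rectangle in x (illustrative values)
Y0, DY = 0.95, 0.1   # elementary rectangle in y (illustrative values)


def has_max_in_rectangle(phi):
    """Maxima of sin(x + phi) sit at x = pi/2 - phi (mod 2*pi) with height 1."""
    x1 = (math.pi / 2.0 - phi) % (2.0 * math.pi)
    return X0 < x1 < X0 + DX and Y0 < 1.0 < Y0 + DY


def conditions_I(phi):
    """Necessary conditions (4), (5), (6)."""
    f, f1, f2 = math.sin(X0 + phi), math.cos(X0 + phi), -math.sin(X0 + phi)
    return (
        Y0 - M2 * DX**2 <= f <= Y0 + DY + M2 * DX**2
        and -M3 * DX**2 < f1
        and f1 + DX * f2 < 2.0 * M3 * DX**2
    )


def conditions_II(phi):
    """Sufficient conditions (13), (14), (15)."""
    f, f1, f2 = math.sin(X0 + phi), math.cos(X0 + phi), -math.sin(X0 + phi)
    return (
        Y0 + M2 * DX**2 <= f <= Y0 + DY - M2 * DX**2
        and 0.0 < f1
        and f1 + DX * f2 < -M3 * DX**2
    )


def check_sandwich(n=20000):
    """Sweep phi over a half-step-offset grid and count each event."""
    n_I = n_max = n_II = violations = 0
    for k in range(n):
        phi = 2.0 * math.pi * (k + 0.5) / n
        c1 = conditions_I(phi)
        mx = has_max_in_rectangle(phi)
        c2 = conditions_II(phi)
        n_I += int(c1)
        n_max += int(mx)
        n_II += int(c2)
        # A violation is II without a maximum, or a maximum without I.
        if (c2 and not mx) or (mx and not c1):
            violations += 1
    return n_II, n_max, n_I, violations


if __name__ == "__main__":
    print(check_sandwich())
```

The three counts come out ordered as count(II) ≤ count(maximum) ≤ count(I), mirroring the ordering of the three probabilities used in the proof.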

It was pointed out at the beginning of this section that the probability
of conditions I being satisfied is greater than the probability $q(\Delta x, \Delta y)$ of a
random curve having a maximum in the elementary rectangle, which in turn
exceeds the probability of conditions II being satisfied. From the definition
of the distribution function $P(\xi, \eta, \zeta)$ it follows that the probability of
conditions I being satisfied is


$\int_{y_0 - M_2(\Delta x)^2}^{y_0 + \Delta y + M_2(\Delta x)^2} d\xi \int_{-M_3(\Delta x)^2}^{M_1} d\eta \int_{-M_2}^{-\eta/\Delta x + 2M_3\Delta x} P(\xi, \eta, \zeta)\,d\zeta$


where $M_1$ is the upper bound of $|F'(x)|$. The limits of integration for $\xi$, $\eta$
and $\zeta$ are fixed by conditions (4), (5) and (6) respectively, together with the
conditions $|\eta| < M_1$ and $|\zeta| < M_2$. However, as $\Delta x$ becomes small the
effective upper limit of $\eta$ becomes $M_2\Delta x + 2M_3(\Delta x)^2$ rather than $M_1$ because
values of $\eta$ larger than the former make the upper limit of integration of $\zeta$
less than $-M_2$ and $P(\xi, \eta, \zeta)$ is zero for these values.

Since $P(\xi, \eta, \zeta)$ is continuous, it is certainly finite, and it may be seen
that the integral just above differs from


$\int_{y_0}^{y_0 + \Delta y} d\xi \int_{0}^{M_2\Delta x} d\eta \int_{-M_2}^{-\eta/\Delta x} P(\xi, \eta, \zeta)\,d\zeta,$

which is of order $\Delta x\Delta y$, by terms of order $\Delta y(\Delta x)^2$. The same line of
reasoning shows that the probability of conditions II being satisfied also differs from
this integral by terms of order $\Delta y(\Delta x)^2$. Therefore, this integral can not
differ from $q(\Delta x, \Delta y)$ by more than terms of order $\Delta y(\Delta x)^2$. By changing
the order of integration, using the mean value theorem for integrals and the
fact that $P(\xi, \eta, \zeta)$ is a continuous function of all its variables, it may be
shown that


$\lim_{\Delta x, \Delta y \to 0} \frac{q(\Delta x, \Delta y)}{\Delta x\,\Delta y} = -\int_{-M_2}^{0} P(y_0, 0, \zeta)\,\zeta\,d\zeta.$

Since $P(y_0, 0, \zeta)$ is zero for $\zeta$ less than $-M_2$, the lower limit of integration
may be replaced by $-\infty$ and we obtain equation (3). This shows that when
the conditions mentioned in the first paragraph of this section are fulfilled
$p(x_0, y_0)\Delta x\Delta y$ is the probability that one or more maxima will lie within the
elementary rectangle.

However our hypothesis allows us to go further and show that this
probability differs from the probability that only a single maximum will occur
in the rectangle by terms of order higher than $\Delta x\Delta y$. If a random curve has
two or more maxima in the elementary rectangle equation (4) of conditions I
is still satisfied. Equation (5) may be replaced by $|F'(x_0)| < M_3(\Delta x)^2$.
This follows from equation (10) since there are at least two points where
$F'(x) = 0$. Also equation (6) may be replaced by $|F''(x_0)| < M_3\Delta x$, which
is similar to (8). When these conditions are used to fix the limits of
integration it is seen that the probability of a random curve having two or more
maxima in the rectangle is not greater than


$\int_{y_0 - M_2(\Delta x)^2}^{y_0 + \Delta y + M_2(\Delta x)^2} d\xi \int_{-M_3(\Delta x)^2}^{M_3(\Delta x)^2} d\eta \int_{-M_3\Delta x}^{M_3\Delta x} P(\xi, \eta, \zeta)\,d\zeta.$


Since this integral is of order $\Delta y(\Delta x)^3$ and since the probability of the
occurrence of one or more maxima is of order $\Delta y\Delta x$, it follows that as the
rectangle becomes small the probability of two or more maxima occurring in
it becomes negligible in comparison with the probability that only one occurs.


Application of theory to an example. It would be interesting to apply 
the theory to determine the distribution of the maxima of a curve defined by 


nz 


which is similar to the case occurring in the telephone transmission line
problem. However, when this is done it is found that the mathematical details
become rather involved despite the straightforwardness of the work. Since
this reduces its value as an example we consider instead one of the simplest
problems; namely, that of studying the distribution of the maxima of curves
of the form

$y = a_0 + 2a_1 x + a_2 x^2$


where the parameters $a_0, a_1, a_2$ are independent, each being distributed about
zero according to a normal law with unit standard deviation. The requirements
that y and its first two derivatives be bounded are not met by this
example, but the reasonableness of the results indicates that the answers are
correct.
When we differentiate and obtain the first and second derivatives of y,


$y' = 2a_1 + 2a_2 x$
$y'' = 2a_2$


we see that y has a maximum only if $a_2$ is less than zero, the probability of
which is one-half. Also, if the maximum occurs it is at the point $x = -a_1/a_2$
and the distribution function of $-a_1/a_2$ accordingly gives us the distribution
of the maximum of y. These two conclusions will be used to check the theory.
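In order to apply the equation


(3) $p(x, y) = -\int_{-\infty}^{0} P(y, 0, \zeta)\,\zeta\,d\zeta$


to find the probability p(x, y)dx dy that a maximum will occur in the
rectangle (x, x + dx; y, y + dy) it is necessary first to determine the
distribution function P(y, y', y''). The expressions for y and its first two derivatives
show that they are linear functions of the normally distributed chance variables
$a_0, a_1, a_2$. We may therefore apply a theorem* in probability theory to find
the required distribution function:

$P(y, y', y'') = (2\pi)^{-3/2} B^{-1/2} \exp\left[-\frac{1}{2B}\left(B_{11}y^2 + B_{22}y'^2 + B_{33}y''^2 + 2B_{12}yy' + 2B_{13}yy'' + 2B_{23}y'y''\right)\right]$

where B is the determinant

$B = \begin{vmatrix} 1 + 4x^2 + x^4 & 4x + 2x^3 & 2x^2 \\ 4x + 2x^3 & 4 + 4x^2 & 4x \\ 2x^2 & 4x & 4 \end{vmatrix} = 16$


and $B_{ij}$ is the cofactor of the element in the i-th row and j-th column. Since
in equation (3) y' is zero we are interested only in $B_{11}$, $B_{13}$, and $B_{33}$.


$B_{11} = 16, \qquad B_{13} = 8x^2, \qquad B_{33} = 4(1 + x^2 + x^4),$


and we have


$P(y, 0, \zeta) = \frac{(2\pi)^{-3/2}}{4} \exp\left[-\frac{y^2}{2} - \frac{x^2 y \zeta}{2} - \frac{(1 + x^2 + x^4)\zeta^2}{8}\right].$


The integral for the distribution function p(x, y) of the maxima becomes


$p(x, y) = \frac{s^2}{\pi\sqrt{2\pi}}\, e^{-y^2/2} \left\{1 + \sqrt{\pi}\, x^2 s y\, e^{x^4 s^2 y^2}\left[1 + \mathrm{erf}(x^2 s y)\right]\right\}$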


* A discussion of the general theorem has been given by van Uven, K. Akad.
Wetensch. Amsterdam Proc. 16, 1124-35 (1914); see also S. O. Rice, Quarterly Journal
of Mathematics, Oxford Series, vol. 9 (1938), pp. 1-4.




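where


$s = 1/\sqrt{2(1 + x^2 + x^4)}, \qquad \mathrm{erf}(u) = \frac{2}{\sqrt{\pi}} \int_0^u e^{-t^2}\,dt.$


The probability that a maximum lies in the strip between x and x + dx
is obtained by integrating p(x, y) from $y = -\infty$ to $y = +\infty$ and
multiplying by dx. The integration is not difficult to perform and gives


(16) $dx \int_{-\infty}^{+\infty} p(x, y)\,dy = \frac{dx}{2\pi(1 + x^2)}.$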
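
Equation (16) can be cross-checked numerically without using the closed form for p(x, y). The sketch below is an assumption-laden illustration rather than the paper's method: it builds $P(y, 0, \zeta)$ directly from $B = 16$ and the cofactors $B_{11}$, $B_{13}$, $B_{33}$ quoted in the text, evaluates the integral of equation (3) by a midpoint rule on a truncated range, and then integrates over y. The function names, truncation limits, and step counts are hypothetical choices.

```python
import math


def joint_density_at_zero_slope(x, y, zeta):
    """P(y, 0, zeta) for the quadratic example, from B and its cofactors."""
    B = 16.0
    B11, B13, B33 = 16.0, 8.0 * x**2, 4.0 * (1.0 + x**2 + x**4)
    q = B11 * y * y + 2.0 * B13 * y * zeta + B33 * zeta * zeta
    return (2.0 * math.pi) ** -1.5 * B**-0.5 * math.exp(-q / (2.0 * B))


def p_maxima(x, y, zeta_min=-30.0, n=600):
    """Equation (3): minus the integral of P(y, 0, zeta) * zeta over zeta < 0.

    The infinite lower limit is truncated at zeta_min (midpoint rule).
    """
    h = -zeta_min / n
    total = 0.0
    for i in range(n):
        z = zeta_min + (i + 0.5) * h
        total += joint_density_at_zero_slope(x, y, z) * z
    return -total * h


def strip_density(x, y_max=10.0, n=200):
    """Integrate p(x, y) over y (truncated to |y| < y_max, midpoint rule)."""
    h = 2.0 * y_max / n
    return h * sum(p_maxima(x, -y_max + (j + 0.5) * h) for j in range(n))


if __name__ == "__main__":
    for x in (0.0, 1.0):
        print(x, strip_density(x), 1.0 / (2.0 * math.pi * (1.0 + x * x)))
```

For moderate x the two printed columns should agree closely; for large |x| the truncation limits would need widening.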


We are now in a position to check the conclusions we drew at the
beginning of this section. The expected number of maxima per curve is
obtained by integrating the probability that a maximum occurs in a strip of
width dx from $x = -\infty$ to $x = +\infty$. Thus the expected number of maxima
in the interval $-\infty < x < +\infty$ for the quadratic curves of our example is


$\int_{-\infty}^{+\infty} dx \int_{-\infty}^{+\infty} dy\; p(x, y) = \int_{-\infty}^{+\infty} \frac{dx}{2\pi(1 + x^2)} = \frac{1}{2}$


which agrees with our first conclusion.

To use the second conclusion to check our theory we have to show that
the point $x = -a_1/a_2$ at which the maximum occurs has a distribution
function which is closely related to the expression (16) which our theory gives for
the probability that a maximum lies in the strip (x, x + dx). The exact
relation which must be satisfied is as follows:


[Probability that a max. lies in x, x + dx]


= [Probability that $a_2 < 0$] × [Probability that $x < -a_1/a_2 < x + dx$].


Equation (16) tells us that the expression on the left is $dx/[2\pi(1 + x^2)]$.
The probability that $a_2$ is negative is 1/2. Furthermore, it is not difficult to
verify that the distribution function of $x = -a_1/a_2$ where $a_1$ and $a_2$ are
normally distributed chance variables is $1/\pi(1 + x^2)$. The two sides of the
above equation are therefore equal and the second conclusion checks our theory.
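
Both checks can also be reproduced by direct sampling. The following is a minimal sketch under the assumption that the example curve is $y = a_0 + 2a_1 x + a_2 x^2$, so that a maximum exists only when $a_2 < 0$ and then lies at $x = -a_1/a_2$; the coefficient $a_0$ does not affect the location and is not drawn. The strip $0 < x < 1$, the sample size, and the seed are arbitrary illustrative choices. Integrating $1/[2\pi(1 + x^2)]$ over that strip gives $\arctan(1)/2\pi = 1/8$.

```python
import random


def simulate_quadratic_maxima(n_curves=200000, seed=1, x_lo=0.0, x_hi=1.0):
    """Return (fraction of curves with a maximum,
               fraction of curves whose maximum lies in x_lo < x < x_hi)."""
    rng = random.Random(seed)
    with_max = 0
    in_strip = 0
    for _ in range(n_curves):
        a1 = rng.gauss(0.0, 1.0)
        a2 = rng.gauss(0.0, 1.0)
        if a2 < 0.0:                      # the parabola opens downward
            with_max += 1
            if x_lo < -a1 / a2 < x_hi:    # location of the maximum
                in_strip += 1
    return with_max / n_curves, in_strip / n_curves


if __name__ == "__main__":
    frac_max, frac_strip = simulate_quadratic_maxima()
    print("fraction with a maximum:", frac_max)        # theory: 1/2
    print("fraction with maximum in (0, 1):", frac_strip)  # theory: 1/8
```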


BELL TELEPHONE LABORATORIES. 




THE CLAMPED SQUARE SHEET.* ¹


By D. G. Bourgin.


The equations of elasticity for the thin plate problem contain planar 
tension and shear terms, together with bending and vertical shear stresses. 
For the moderately thin plate with deflections small compared with the thick- 
ness the planar stresses and second order terms in the strains may be neglected, 
thus leaving a single thin plate linear equation given in the standard treatises 
and extensively investigated in the literature. 

The term “sheet” is introduced in this paper to designate the opposite 
extreme, namely that of ultra-thin plates with large vertical deflections for 
which the plane stresses alone are significant. This sort of problem arises in 
aeronautics and kindred fields where paper-thin coverings are used. Second 
order terms must be retained. Two non-linear equations are fundamental.
The superposition principle in the deflections is no longer applicable. Although 
an Airy’s type function is introduced, it refers to plane stresses and not to 
vertical deflections as in the case of plates. New boundary conditions appear 
and require special interpretation both in the discussion of the solution and in 
the uniqueness demonstration. Other differences with the moderately thin 
plate investigations will be apparent in the sequel. 

The specific concern of the present work is the clamped square sheet. So 
far as the writer is aware, this problem has been treated by two authors only;
namely, Hencky² and Föppl.³ These authors are avowedly interested solely
in numerical approximations and make no pretense of attempting a rigorous 
solution. 

The method of Hencky consists in replacing derivatives by finite differences 
and solving the resulting system of simultaneous linear algebraic equations. 
Even for three subdivisions the labor of computation is quite formidable. 
Moreover, to the writer’s mind, the boundary conditions used are incorrect. 
The method of Föppl consists in approximating the displacements z, u, and v
by single constant products of the form $c \sin x/a \sin y/a$; $u_0 \sin 2x/a \sin y/a$;
$v_0 \sin x/a \sin 2y/a$, ($c$, $u_0$, $v_0$ are undetermined coefficients) plus the further


* Received September 8, 1937; Revised July 5, 1938. 

¹ Presented to the American Mathematical Society, January 1, 1936.

² H. Hencky, Z. für Ang. Math. und Mech., Bd. 1 (1914), pp. 81, 423.

³ A. and L. Föppl, Drang und Zwang, 2nd ed., p. 226. This joint work will be
referred to in the singular.




arbitrary assumption that the pressure is then constant, and then applying the 
currently popular Ritz method to determine the relations between the three 
constants. The objections to be urged against his work are that the three 
assumed approximations are not compatible with the proper boundary con- 
ditions, and no criteria are provided for determining the deviation of any 
results obtained from the correct expressions. 

The problem formulated by Hencky and Föppl is that of uniform pressure.
The determination of the stresses and strains for assigned pressure will gen- 
erally involve long and unwieldy expressions. In principle, the problem of 
uniform pressure may perhaps be solved by using series expansion for the 
vertical deflection, the coefficients being determined to satisfy the uniform 
pressure condition. The general method is clear from the case, treated here, 
of a single such term in the vertical deflection. We may alternatively look upon 
the present research as a complete solution of a sort of converse problem— 
namely that of the determination of the stresses, strains and pressure for a 
particular assigned vertical deflection. For comparison with Föppl’s analysis,
the deflection z is, except for notation, precisely that adopted by him. It is
thus possible, as a result of our work, to gauge the departure from the
constancy that Föppl assumes for the pressure. Moreover, the accurate
expressions for u and v may be used to determine the adequacy of the arbitrary
approximations made by him. 

In connection with the practical computations an important paper by
March⁴ on the moderately thin plate ought to be cited. His procedure, like ours,
involves the solution of an infinite system of linear algebraic equations though
his empirical method of numerical solution is given no justification in his
paper. He demonstrates uniqueness of the Neumann expansion, our Eq. 33,
and anticipates the idea behind our Eq. 38.

In the main, two methods are followed below. The first makes the stresses 
central, which seems to be a comparative novelty in Elasticity theory; and the 
second, the strains. Uniqueness of the solutions is established, and the pro- 
cedure is rigorously founded. 

We make our problem concrete by postulating a square sheet of side $a\pi$
placed along the coördinate axes in the first quadrant. For convenience in
stating the formulae the following conventions will be observed:

P before an expression containing x and y indicates that a similar
expression with y, x replacing x, y is to be added. Q implies that the second
expression is to be subtracted.


⁴ H. W. March, Transactions of the American Mathematical Society, vol. 27 (1925),
p. 307. The writer is indebted to the referee for this and other references which have 
allowed elision of overlapping material. 




S, C, T, Sh, Ch, Th, Sch, are abbreviations for the sine, cos, tan, sinh,
cosh, tanh, sech, respectively. Example:


$P\ \mathrm{Sh}\,x\ C\,y = \sinh x \cos y + \sinh y \cos x,$

$Q\ \mathrm{Sh}\,x\ C\,y = \sinh x \cos y - \sinh y \cos x.$
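
The P and Q conventions are simply symmetrizing and antisymmetrizing operators on expressions in x and y. A minimal sketch follows; the Python names `P`, `Q`, `Sh`, `C`, and `sh_x_c_y` are illustrative stand-ins for the paper's shorthand and not anything the paper defines in code.

```python
import math


def P(expr):
    """Add to expr(x, y) the same expression with x and y interchanged."""
    return lambda x, y: expr(x, y) + expr(y, x)


def Q(expr):
    """Subtract from expr(x, y) the same expression with x and y interchanged."""
    return lambda x, y: expr(x, y) - expr(y, x)


# Abbreviations in the style of the text.
Sh, C = math.sinh, math.cos


def sh_x_c_y(x, y):
    """The expression Sh x C y, that is sinh(x) * cos(y)."""
    return Sh(x) * C(y)


if __name__ == "__main__":
    x, y = 0.3, 1.1
    print(P(sh_x_c_y)(x, y))  # sinh x cos y + sinh y cos x
    print(Q(sh_x_c_y)(x, y))  # sinh x cos y - sinh y cos x
```

By construction a P-expression is symmetric and a Q-expression antisymmetric under the interchange of x and y.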


The differential system. The following equations are fundamental.® 


(1) $e_1 = u_x + z_x^2/2 = \partial u/\partial x + (\partial z/\partial x)^2/2$
    $e_2 = v_y + z_y^2/2$
    $e_{12} = v_x + u_y + z_x z_y$
T, + 2pe, = + vee), 
(1.1) T's = AA + 2peo = A (es + 
= pers = A(1— oo) 12/2, A=A+ 


where $\lambda$, $\mu$ are the usual elastic constants, $\sigma$ is Poisson’s ratio, the e’s are the
strains and the $T_i$’s are the tensional stresses with $S_1$ the planar shear.
u, v, z are the deflections in the x, y, z directions, respectively.⁶

The assumption that the stresses are derivable from an Airy’s function 
is sufficient to uniquely determine the correct equations for the sheet. Thus 
with 
(2) $T_1 = U_{yy}, \qquad T_2 = U_{xx}, \qquad S_1 = -U_{xy}$
there follows 


OS 

0x oy 

0A Oe 0A 02 
3.1) Aa Gy (Ate) +e 


== 0) 


where $\nabla^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2$. Similarly


(3. 2) (A+ p) + 


arises from $\partial T_2/\partial y + \partial S_1/\partial x = 0$. On differentiating equations 3.1 and 3.2
with respect to x and y there results, after evident reduction,


(4) (A + 2p) V2A + — Zay?) = 0. 


From Eq. 1.1, (A+ 


⁵ A. E. H. Love, Theory of Elasticity, 4th ed., for notation.

⁶ For very large or “finite” displacements, the far reaching theory of F. D.
Murnaghan, American Journal of Mathematics, vol. 59 (1937), p. 235, must furnish the
more general equations corresponding to Eqs. 1, 1.1.




(5) $\nabla^4 U = -A(1 - \sigma^2)\left[z_{xx} z_{yy} - z_{xy}^2\right].$
This is in agreement with the result given by Love.⁷ The equation of
equilibrium for the vertical stresses is easily derived in the form
(6) T + + = — p. 


For the clamped sheet, the conditions at $x = 0, a\pi$ are u, v, z = 0 and
hence the y derivatives to any order vanish. A similar statement holds for
the boundaries $y = 0, a\pi$ with x replacing y as the differentiating variable.
To determine the boundary conditions in terms of the stresses we revert to
Eq. 1.1 from which it is patent that
(7) | Ao + | == of’; | = pl’ | 
(7. 1) T; | pl’ |y=0,07 


with $\rho = \sigma/(\sigma + 1)$.
Evidently 
(8) ,/dx | o=0,08 == A + + | 


(8. 1) | — (A(o 1) /2) Vay | 


According to Eq. 2 we have 


l+o 


(9) (Ure + | Vay | o-0,07- 
Now 
(8. 2) OT 2/02 = A(Vay + (Use + ) | 


Combining Eqs. 8 and 1.4 we may write 


OT oT’. 


with $l = -(1 + \sigma)/(2 + \sigma)$. Eq. 7.2 involves the existence of third
derivatives on the boundary. The physical implications are retained in the weaker


4 Weer)| 


(7.2) L(2,y) 


2=0,aT 


condition, suitably interpreted, 


y 


(7.21) : 
f E(y, |y=0,ar dx = 0. 
at /2 


7 Love, loc. cit., p. 558. 




Eq. 7.21 must be interpreted as a limit result for a sequence of integrals
taken along parallels to the boundaries. This convention is denoted by the
letter F. Specifically, F implies in detail


Yo 


(1.3) 
Ly a 


an/2 


y) iy) 


1/2 


where $\bar{x}, \bar{y}$ are the coördinates of an inner point and $x^0, y^0$ of a point on the
boundary of the square and a may be either x or y. We observe in this
connection that the continuity of $U_{aab}(x, y)$ in both variables in the interior of
the square is implied in the orthodox conception of a solution of Eq. 5.
Fortunately, the solution U obtained in this paper is furthermore such that
the integral on the right-hand side of Eq. 7.3 is of class $C^0$ in the closed
square. Accordingly, we are spared the otherwise necessary consideration of
mode of approach of $\bar{x}, \bar{y}$ to $x^0, y^0$. Of course the integral on the left side of
Eq. 7.3 may also exist as an ordinary boundary integral with a value differing
from that assigned by F. Convention F and the boundary conditions in the
stresses, namely Eqs. 7, 7.1, 7.2 and 7.21, do not occur in the literature
of Elasticity Theory.


Uniqueness of solution. The set of Eqs. 5, 6, 7, 7.1 and 7.21, involve
the dependent variables U, p, z. We now demonstrate that for assigned z of
class C* in the closed square, the solution U, supposed to exist, is unique
except for linear terms in x and y under fairly general conditions. Actual
solutions are exhibited later for U and p for the special assignment in Eq. 11.

We require of a solution U that (a) Eq. 5 is satisfied in the open square.
(b) $U_{aab}$ is absolutely integrable over any closed rectangle in the open square.
a, b may separately be either x or y. (c) U is of class C* in the open and
$C^2$ in the closed square. (d) All indefinite integrals in the x or y direction
of third derivatives of U are of class $C^0$ in the closed square. (e) Eqs. 7,
7.1 and 7.21 are satisfied. Eq. 7.21 holds in the sense F for the boundary $\Gamma$
of the square.

Suppose there could be two solutions $U^A$ and $U^B$. We form $U'' = U^A - U^B$
which satisfies the continuity and boundary value conditions assumed for $U^A$
and $U^B$ and is a solution of Eq. 5 with right-hand side 0.

Consider the Calculus of Variations problem connected with minimizing⁹


⁸ Throughout this paper the terminology class $C^n$ implies continuity through
derivatives of order n in both independent variables.

⁹ This variational integral is introduced in an ad hoc fashion merely to establish
uniqueness, and involves the Airy’s function in the plane stresses of a sheet whose edges
are clamped. There is a striking formal resemblance (when $1 + \sigma$ is replaced by $1 - \sigma$)




(10) $J = \int_0^{a\pi}\!\!\int_0^{a\pi} \left((\nabla^2 U)^2 - 2(1 + \sigma)(U_{xx} U_{yy} - U_{xy}^2)\right) dx\,dy$


(10.1) $\quad = \int_0^{a\pi}\!\!\int_0^{a\pi} \left(U_{xx}^2 + U_{yy}^2 - 2\sigma U_{xx} U_{yy} + 2(1 + \sigma) U_{xy}^2\right) dx\,dy$


for functions U subject to the continuity stipulations catalogued above for U.
The separate integrals in Eqs. 10 or 10.1 exist and are finite since U is of
class $C^2$ in the closed square. With $\sigma < 1$ the sum of the first three terms of
the integrand of Eq. 10.1 is $\ge 0$. Accordingly, the integrand is positive
definite in the quantities $U_{xx}$, $U_{yy}$, $U_{xy}$. Hence the minimum is actually
attained by the unique choice,

(10.2) $U_{xx} = U_{yy} = U_{xy} = 0$


and the continuity requirements preclude existence of any exceptional points
in the closed square. The symmetry of U as regards x and y bars the
introduction of linear x and y terms obtained on integrating the second derivative
in Eq. 10.2. Thus, designating the solving function as U', we have


(10.3) $U' = C + sx + ty.$
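
The step from the first form of the integrand to the second, and the sign claim that follows, can be spot-checked numerically. The sketch below is an illustration under an assumed reading of the two integrands, namely $(U_{xx} + U_{yy})^2 - 2(1+\sigma)(U_{xx}U_{yy} - U_{xy}^2)$ and $U_{xx}^2 + U_{yy}^2 - 2\sigma U_{xx}U_{yy} + 2(1+\sigma)U_{xy}^2$; it samples random values for the three second derivatives and for $\sigma$ in the illustrative range $0 < \sigma < 0.5$. All function names are hypothetical.

```python
import random


def integrand_first_form(uxx, uyy, uxy, sigma):
    """(Laplacian of U)^2 - 2(1 + sigma)(Uxx*Uyy - Uxy^2)."""
    return (uxx + uyy) ** 2 - 2.0 * (1.0 + sigma) * (uxx * uyy - uxy**2)


def integrand_second_form(uxx, uyy, uxy, sigma):
    """Uxx^2 + Uyy^2 - 2*sigma*Uxx*Uyy + 2(1 + sigma)*Uxy^2."""
    return uxx**2 + uyy**2 - 2.0 * sigma * uxx * uyy + 2.0 * (1.0 + sigma) * uxy**2


def check_forms(n=10000, seed=0):
    """Return (largest discrepancy between the forms, smallest sampled value)."""
    rng = random.Random(seed)
    max_diff = 0.0
    min_val = float("inf")
    for _ in range(n):
        uxx, uyy, uxy = (rng.uniform(-1.0, 1.0) for _ in range(3))
        sigma = rng.uniform(0.0, 0.5)
        a = integrand_first_form(uxx, uyy, uxy, sigma)
        b = integrand_second_form(uxx, uyy, uxy, sigma)
        max_diff = max(max_diff, abs(a - b))
        min_val = min(min_val, b)
    return max_diff, min_val


if __name__ == "__main__":
    print(check_forms())
```

As a quadratic form in the three second derivatives the second expression has eigenvalues $1 - \sigma$, $1 + \sigma$ and $2(1 + \sigma)$, all positive for $|\sigma| < 1$, which is why it vanishes only when all three second derivatives do.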


The solution of the variational problem may alternatively be determined 


from 
(10. 4) == 0 = 


{ f (U) dedy 
11 


where $\Gamma_{\epsilon_i, \eta_j}$ is the boundary of a rectangle, $\epsilon_1 \le x \le a\pi - \epsilon_2$, $\eta_1 \le y \le a\pi - \eta_2$,
$\epsilon_i, \eta_j > 0$ interior to the square. $\delta U$ may be taken as an arbitrary function
of class $C^2$ in the closed square. s and n refer to directions along and normal
to the boundary. The application of Green’s theorem implied in the right side
of Eq. 10.4, for each of the approximating rectangles, is justified in view of
hypotheses (b) and (c) and the restriction on $\delta U$.¹⁰ We proceed to show that
U'' is the solution required in Eq. 10.4.


to the potential energy expression in the vertical strains for the free boundary problem 
of the thin plate. Cf. Courant-Hilbert, Methoden der Mathematischen Physik, 1st ed. 
p. 149. 
¹⁰ For completeness we should include the conditions
ay (4 y) = ay B) 
where $\alpha$, $\beta$ are coördinates of a corner. However, these corner conditions are
automatically fulfilled in view of (c).




It is at once verified that the Euler Lagrange equation, L(U) = 0, is
precisely the homogeneous form of Eq. 5 and is satisfied by U = U'. Hence
the limit of the double integral in Eq. 10.4 is 0.

Next we remark the easily verified relation 


(10. 5) I, | c=0,07 | --— + (1 -}- a) U ee | | 


and the analogous identity for the sides $y = 0, a\pi$. Hence, in view of Eqs. 7,
7.1, $I_1$ vanishes on the boundary $\Gamma$ for U = U''. According to (c) $I_1$ is
continuous in the closed square. Hence the limit of the first boundary integral
in Eq. 10.4 is identical with the ordinary integral over the boundary $\Gamma$ and
has the value 0.

We turn to the second boundary integral. Evidently 


=H (z,y) 
— L(y, | 


(10. 6) 


I, 


On integrating by parts, as is permissible in view of (c) and the
restriction on $\delta U$, we have for instance


1 


11 


In view of (d) the indefinite integral of $L(\epsilon, y)$ is of class $C^0$ in $\epsilon$ and y. It
follows that each of the two expressions on the right-hand side is of class $C^0$
in $\epsilon$ and $\eta_j$ for $0 \le \epsilon, \eta_j \le a\pi$, j = 1, 2. The order of the operations of
integration and passage to the limit $\epsilon \to 0$ for the definite integral on the right
side of Eq. 10.61 may therefore be interchanged. We recall the
interpretation of Eq. 7.21 in the sense F. It is clear then that, with U = U'', the limit
of the right-hand side of Eq. 10.61, for $\epsilon, \eta_j \to 0$, j = 1, 2, is 0 and is taken on
uniformly with respect to $\epsilon$ and $\eta_1, \eta_2$. An analogous result holds for approach
to the sides $x = a\pi$ or $y = 0, a\pi$ of the appropriate integrals of the type of the
left side of Eq. 10.61. Accordingly, the second boundary integral in Eq. 10.4
goes to 0 uniformly for all modes of approach of $\Gamma_{\epsilon_i, \eta_j}$ to $\Gamma$ when U = U''.

We have verified that U = U'' is the solution in the sense F of $\delta J = 0$.
On taking $\delta U = U$, as is permissible, it is clear that $\lim \delta J_{\epsilon_i, \eta_j} = 2J$.
Hence U'' yields an absolute, and not merely a relative, minimum for J.
The variational problem and that of the homogeneous differential equation
system, i.e. Eqs. 5, 7, 7.1 and 7.21, are one whence the conclusion


(10.7) $U' = U'' = C + sx + ty.$




Since Ug, Va, T;, T'2, p, = 1, 2, involve U through its second derivatives only, 
their uniqueness is established. 
We consider the case


(11) $z = c\ S\,\frac{x}{a}\ S\,\frac{y}{a}.$

Method I. The first method of determining a solution of the clamped 
sheet problem subject to Eq. 11 utilizes Eqs. 5, 6, 7, 7.1, 7.21. The
demonstration of the validity of the ensuing formal developments is reserved to a
later position in the paper. 

Consider first Eq. 5. A possible solution is 


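(12) $T_p = -\frac{d^2}{8a^2}\left[C\,\frac{2x}{a} + C\,\frac{2y}{a}\right]$


where $d = c\,[A(1 - \sigma^2)]^{1/2}$.


(13) $T_p\,|_{x=0,a\pi}$ or $T_p\,|_{y=0,a\pi}$ $= -\frac{d^2}{8a^2}\left[2C^2\,\frac{y}{a}\right]$ or $-\frac{d^2}{8a^2}\left[2C^2\,\frac{x}{a}\right].$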
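
That a function of this form is a particular solution can be spot-checked numerically. The sketch below rests on several assumptions that should be read as one interpretation of the damaged text, not as the paper's statement: Eq. 5 is taken as $\nabla^2 T = -A(1-\sigma^2)(z_{xx}z_{yy} - z_{xy}^2)$ with $T = \nabla^2 U$, the deflection as $z = c \sin(x/a)\sin(y/a)$, the particular solution as $T_p = -(d^2/8a^2)[\cos(2x/a) + \cos(2y/a)]$, and $d^2 = c^2 A(1-\sigma^2)$. The numerical values of a, c, A, $\sigma$ and the sample point are arbitrary.

```python
import math


def check_particular_solution(a=1.3, c=0.2, A=2.0, sigma=0.3,
                              x=0.4, y=0.9, h=1e-3):
    """Return (finite-difference Laplacian of T_p, assumed right side of Eq. 5)."""
    d2 = c * c * A * (1.0 - sigma**2)  # assumed relation d^2 = c^2 A (1 - sigma^2)

    def T_p(px, py):
        return -d2 / (8.0 * a * a) * (math.cos(2.0 * px / a) + math.cos(2.0 * py / a))

    # Five-point central-difference Laplacian of T_p at (x, y).
    lap = (T_p(x + h, y) + T_p(x - h, y) + T_p(x, y + h) + T_p(x, y - h)
           - 4.0 * T_p(x, y)) / h**2

    # Second derivatives of the assumed deflection z = c sin(x/a) sin(y/a).
    z_xx = -(c / a**2) * math.sin(x / a) * math.sin(y / a)
    z_yy = z_xx
    z_xy = (c / a**2) * math.cos(x / a) * math.cos(y / a)

    rhs = -A * (1.0 - sigma**2) * (z_xx * z_yy - z_xy**2)
    return lap, rhs


if __name__ == "__main__":
    print(check_particular_solution())
```

The agreement follows from the identity $\cos^2 u \cos^2 v - \sin^2 u \sin^2 v = \tfrac{1}{2}(\cos 2u + \cos 2v)$.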


T 


We wish next to determine »7', a harmonic function, such that 


T\p—— 


The formulae and later convergence discussion are simplified by replacing 
W 
C?—,w=2,y of Eq. 13 by 1 — S? — 
a a 
m 
Ch—(=~—y 
8 a\2 me 


ol d’/8a [2+ 2 


2 


We write 


m 


The symmetry of the problem requires summation over odd subscripts alone.
Throughout this paper all summations are to be understood in this sense. For
the sides $y = 0, a\pi$ it is merely necessary to interchange y and x in Eq. 14.
The crux of our problem is essentially the determination of the values $\{h_m\}$.
We now exhibit T, + .7+,7 


(15) T= d?/8a?(2—C 2a/a—C 2y/a + 2/43 PAmfin(2) y] 
where Am = hm + 8/m(m? — 4) 




On making use of the Green’s function, for instance, it is not difficult to
find a solution of $\nabla^2 V = T$, namely


(x — arr/2)? 
2 


(16) V = d?/8a? | P 


where 


w Sh (am —w) + (a —w) w 


Jn(w) = 
2 Ch? 


Manifestly, U = V + W also satisfies $\nabla^2 U = T$ provided W is a
harmonic function. We determine this harmonic function by the conditions Eqs.
7, 7.1.

(7.01) $(W_{xx} - \rho T + V_{xx})\,|_{x=0,a\pi} = 0$


(7.11) $(W_{xx} - (1 - \rho)T + V_{xx})\,|_{y=0,a\pi} = 0.$


Assuming the boundary values are taken on continuously, as indeed we show 
subsequently, we may write 


2 
(17) V [ = S my | 


Observing that 
2 8 
—4) 
we have therefore 


d? ay 2 m ] 
(17. 1) =0,a7 a2 1 (1 p) sh S a 


8a? a 
Similarly 

d? Mx 


We are faced with the problem of determining a harmonic function sub- 
ject to the condition that certain second derivatives take on prescribed values 
on the boundary. This seems somewhat different in form from the Dirichlet 
or Neumann problem of potential theory, but the solution is easily found in 
our simple case from the observation that $W_{xx}$ must itself be harmonic.
Explicitly then, 


m (m? — 4 


5 Ofm(y) — Wap 
12 


a 
— Amgm(Z) vt ] 


The value of W derived from Eq. 18 is subject to the arbitrary inclusion of


$rxy + sx + ty + C.$
Symmetry conditions require r = s = t = 0 leaving open the value of C as
has already been remarked in the discussion of the uniqueness of U. (So far
as the tensional stresses are concerned, the values of s, t, C are unimportant,
and a non-vanishing r would make its presence evident in $U_{xy}$ alone).
For purposes of completeness it is desirable to give the explicit expressions
for $T_1$, $T_2$, $S_1$. Thus
$T_1(x, y) = T_2(y, x)$
(19) T1(2,y) [1—0 2 + = 
m 
(Am x) S— +5 S 


@ 
at /2 


ar ML 


Sh — 
2 


a 
8a? Ch Sy 2 
Eq. 7.2 would yield series (and equations) in $m h_m$ rather than in $h_m$.


More seriously however, the series mh» S Gm = fim OY — Jm, Tequires 
a 


conditions for its introduction which we are not prepared to justify.
Accordingly, we replace Eq. 7.2 by Eq. 7.21


y (0 
21)%? C Weee)| dy = 0 


m (m? — (1—p)hm Fm(y) dy a 


2 
y m\2 
+ Am f gm(y) dy (“) 


¹¹ The symmetry in x and y of $U_{xy}$ eliminates extraneous y functions introduced
by the integration.

¹² The term by term differentiations here and earlier are justified by the uniform
convergence of the resulting series which are included in Eq. 22.




THE CLAMPED SQUARE SHEET. 427 


In view of elementary properties of hyperbolic functions and the relation
$l(1-p) = -(1+l)$ we may write this in the form


(19.1) — { + Shm 


8 T 


. 8 1+] 
f(y) dy 


a 


4lm y 


am /2 


The Fourier series for $f_m$ and $g_m$, assumed to be odd functions, and for
their integrals are


8m pa py 8 ma? py 
gn(y) m?)? a ? gm(y) dy = (p? + m?)? C a 


On using these expansions we obtain on the right side of Eq. 19.1 certain
double summations $\sum_m \sum_p$. We first interchange the order of summing.
Then since the summation indices are “dummies” in the sense of tensor
analysis, the scripts $m$, $p$ may be replaced by $p$, $m$ respectively. Accordingly,
the right side of Eq. 19.1 becomes
, my 8 1 
(p? +m’)? p?—4 


The two sets of terms involving h, may be combined to yield 


(m? +- 


To evaluate the summations involving we use contour integrals 


—4 
1 
(20 aS, 
44 Jc 4) (2+ m’) 2 p? m* 
(p = 1, 3,- -) 
1 1 2? an ] 
20.1 {= 
24 (22+ m?)? 2 p> —4 (p? + m’)? 




The integration path C may be taken as two Hankel loops. The first starts at 
coo + «tl, approaches its vertex at z 4 through valves in the first quadrant, 
then doubles back below the real axis to $\infty - \epsilon i$. The second is the reflection
of this in the imaginary axis.¹³ The loops may be closed by verticals away
from the poles at $z = \pm mi$, and by allowing these gates to approach $+\infty$,
$-\infty$ respectively the relations Eqs. 20 and 20.1 are easily verified. The
proof that the path integrals remain finite is simple. We observe now that
the integrals may also be evaluated by considering the residues associated with 
the singularities along the imaginary axis. When this is done, there is obtained 


1 1 T 1 mar 
(21. 1) 24 (p?+m?) 4m 2 2(m? + 4) (m? + p?)? 
1 


Eq. 19.1 is brought to the form
(19. 3) Sym = 0. 


If the series on the left is a Fourier cosine series, it is well known that it must 
be unique. Hence the coefficients may be separately equated to 0. Thus 


Mr 16m mr 
(19. 4) [ (: + Ip + (1 + +3 € + oh 


321 m? php 
m(m? + 4)? + 
» 


2 


18 Alternatively, the contours may be taken as the two lines z=-+ 3. Another 
method of calculation, used as a check, is based on an extension of the usual Parseval 
theorem. For the applications we require the algorithm, certainly valid for functions 


of class C’, 
f F(t) F,(«— t)dt = Oma = (2) 


us 


where 
Sma Sma 
™Oma’ 


The procedure is summed up in the formula 


On = = 2d 


Tv 
anf da = =2g,,Cmz. 
It enables us to sum series such as
$\sum_m [(m^2 + a^2)(m^2 + p^2)(m^2 + r^2)]^{-1}$, $a$, $p$, $r$ real,
for the functions associated with $(m^2 + a^2)^{-1}$ etc. as Fourier coefficients are known.
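A numerical cross-check on sums of this kind may be sketched as follows. The sketch treats a two-factor analogue, $\sum_{m \ge 1} [(m^2+a^2)(m^2+p^2)]^{-1}$, and compares a closed form with a direct partial sum. It does not reproduce the contour-integral or the extended Parseval argument of the text; it assumes instead the standard identity $\sum_{m\ge1} 1/(m^2+a^2) = (\pi a \coth \pi a - 1)/2a^2$ together with partial fractions. The function names are illustrative only.

```python
import math

def s1(a):
    """Closed form of sum_{m>=1} 1/(m^2 + a^2), real a != 0."""
    return (math.pi * a / math.tanh(math.pi * a) - 1.0) / (2.0 * a * a)

def s2(a, p):
    """Closed form of sum_{m>=1} 1/((m^2+a^2)(m^2+p^2)), a != p, by partial fractions."""
    return (s1(a) - s1(p)) / (p * p - a * a)

def s2_direct(a, p, terms=2000):
    """Direct partial sum; the neglected tail is below 1/(3 * terms**3)."""
    return sum(1.0 / ((m * m + a * a) * (m * m + p * p))
               for m in range(1, terms + 1))

if __name__ == "__main__":
    # The two values should agree to roughly ten decimal places.
    print(s2(1.0, 2.0), s2_direct(1.0, 2.0))
```

A three-factor sum of the type quoted above could be handled the same way, by one further partial-fraction step.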




On replacing $l$, $p$ by the $\sigma$ equivalents, we have
Mr 8 (1 a) pho 


16m 64(1 +c) 


This is the central equation. 

The legitimacy of the procedure of Method I will now be investigated.
We must justify (a) obtaining the boundary values of the functions defined
by series by introducing ($y = 0, a\pi$) in the terms of the series,
(b) the grouping of terms of the several series, (c) the interchange of
summation order in $m$ and $p$, and (d) the assertion that the sum on the left side
of Eq. 19.3 is a Fourier series.

We anticipate the key inequality of Eq. 38, namely


(38) $|k_m| \le E/m^{1+\delta}$, $1 > \delta > 0$ for $\sigma < 5/7.3$


and it is for such $\sigma$ values, alone, that the following demonstrations apply.
The series we have to consider are of the form 


qnu(w) represents alternatively fm(w), gn(w), < f'm(w), J m(w), 


w m m 2 
f im(w) dw, f (*) Gm(w) dw. 
atr/2 ar/2 \@ 


The prime denotes differentiation. pm(t) is either S or 
$a_m$ stands for $h_m$ or $1/m(m^2-4)$, $1/m(m^2+4)$, $1/m(m^2+4)^2$. We
remark two important characteristics: (i) $q_m(w)p_m(t)$ is of class $C^0$ in the
closed square. (ii) $|q_m(w)p_m(t)| < M$, $0 \le w, t \le a\pi$.


Evidently $\sum |a_m|$ is a convergent series according to Eq. 38. It then
follows from (ii) that the series of type Eq. 22 converge absolutely and
uniformly in the closed square, thus justifying operations (a) and (b).

Furthermore, in view of (i) the functions defined by the series are of
class $C^0$. Starting with Eqs. 19 we may easily write down the explicit formal


14 It is elementary of course that the Fourier series of a function with a
discontinuity at $x = 0$ does not approach the value of the series at the point from either side.
Thus, for instance, Eq. 14 does not really posit 0 values for $T$ at the corners. This fact
is rather a product of the analysis and is connected with the uniform convergence of
the series Eq. 22.




expansion of the right hand integral, exclusive of limit sign, in Eq. 7.3.
The expansion in question is a sum of series of type Eq. 22 where $q_m(w)$ is
restricted to the last four expressions catalogued above. It is accordingly of
class $C^0$ as was asserted in connection with convention F.
We now consider (c). The series 


p m py 


3 


+ Pp 


are evidently absolutely convergent double series. Therefore the order of
summation may be interchanged, as was done to obtain Eq. 19.2.

We proceed to (d). Each term in the left hand bracket of Eq. 19.4 is
dominated by $M/m^{1+\delta}$. The only term for which this assertion is not
immediately evident is


Rm = 1/Th mx/2 m?php/(m? + p*)?. 
Dp 


However, on writing $|h_m| \le L/m^{1+\delta}$ in place of Eq. 35, it is clear that Eqs.
36, 36.1 and 37, cf. sequel, establish this sort of dominant for


€ 1l1+o mr ) 


Sh mr 


and accordingly for $R_m$ as well. Hence $\gamma_m$, i.e., the left hand bracket of Eq.
19.4, is inferior to $\gamma/m^{1+\delta}$ where $\gamma$ is a suitable fixed constant. Therefore


$\sum \gamma_m^2 < \infty.$


The Riesz-Fischer theorem now guarantees that the left hand side of Eq. 19.3
is a Fourier series.
Our derivation of Eq. 19.4 is therefore unexceptionable.


Method II. Eq. 19.5 (19.4) may also be arrived at by a different
method, which besides its independent interest incidentally affords a check
on the accuracy of that equation. Except for the possible occurrence of sines instead
of cosines, the series encountered below are of the types Eqs. 22.1-22.3.
Accordingly, the validation of the operations conducted in the development
of Method II is really already implicitly subsumed in the discussion of those
equations.

The starting point now is Eq. 3.1 which may be written in the form


1 


2 


(3. 11) 




If the last term on the right be dropped, a solution of the resulting equation 
is evidently 


1 OV 


+> 34m gm(y) | | cf. Eqs. 11, 17. 


A solution of 


= —22V 722 = [ s = —S = 


a a a 


(23. 1) tle = 4) C ‘ 


We require now a harmonic function $u_3$ vanishing for $y = 0, a\pi$ and
such that
U2|e=0 = — (Wj + U2) | Us| Us| 


On making use of Eq. 20 we have 


8m*p 
[( 2 Am | (m? + p*)? 


hot 
PY Sh ma + mx a\2 g my 
giz jon 
2Ch? Ch — 


pr a 
2 2 2 


uz vanishes on 2 0,az and takes on the valuc 


(24. 1) /8a? E 


for $y = 0, a\pi$. To balance this we introduce the harmonic function $u_4$ which
takes on the negative of the values in Eq. 24.1 for $y = 0, a\pi$ and is 0 on the
other two sides, viz.


02 2Q22 2 far 
(25 
8a? 2 a a\ 2 y 


The required solution of Eq. 3.11 with $u|_\Gamma = 0$ is


(26) $u(x, y) = u_1 + u_2 + u_3 + u_4 = v(y, x)$


2 [ar 
| 


Chr 
o mx m my 
ES (am —2) — Sh 2— "2 0h ™ (a —2) + (an —2) ™ On 
2Ch 
8m?p sb g PY 
+ m*)? “a 
2 
__ Sh mn a\2 ad 
ma af 
2Ch 9 Sh 5 
We recall that 
1 
2/¢ 
Us + /2\e=00" = FTG) T 
my 


Thus after some direct reduction 


2 (ar 
Ch - 


Cha 


Py 
8mp? 1 8h Mr — Mr 


a(m* + p?)? pr mar 
Th 2 Ch? Th— 


| 
28 +- 


—(1—«) 2S in 8 


The regrouping of terms taken from the set $\{u_r\}$, $r = 1, \cdots, 4$ is allowable
here since all the series entering are absolutely convergent in view of
Eq. 38. The summation order on $p$ and $m$ may be interchanged and the
scripts $p$, $m$ substituted for $m$, $p$ by the absolute convergence property of


Ammp?/| (mm? +p)? Th g 




By using a contour integral of the type introduced in Eqs. 20, 20.1 or 
the extended Parseval theorem, we may show that 


‘ 1 1 


a +4 1 


On replacing the terms outside the summation sign in Eq. 27 by their 
absolutely and uniformly convergent Fourier sine expansions, we may express 
that equation as 


(27.1) hm Sy 0. 
Absolute convergence of the sine series whose coefficients are 


1/m(m? +4), 1/m(m?—4), m* hyp/(m? + p?)?, 


justifies the attendant groupings. Furthermore, since by Eq. 38 the
coefficients in each of these series go down at least as rapidly as $m^{-1-\delta}$ we conclude
that


$|\beta_m| \le B/m^{1+\delta}$ for $\sigma < 5/7.3$, $1 > \delta > 0$.


In the same manner as in the case of Eq. 19.3, we show that the left side of
Eq. 27.1 is a Fourier sine series and hence that $\beta_m = 0$ where $\beta_m (= \gamma_m)$
is precisely the expression in Eq. 19.5.

In view of our use of convention F and in the interests of completeness,
we consider the displacements on the basis of the stresses derived in Method I.
Incidentally we are enabled to exhibit an alternative expression for $u(x, y)$
somewhat simpler in form than that of Eq. 26. In fact Eq. 1 yields


(tty 53/2) de (A(1—o?))* f (T, —oT») de 
a atr/2 


(28) y) = c?/8a CG —2)+ —as S 
3 3— 
m am a g mx ) 
j — xfm(x) + + gm(y) ( 
a 


where $\alpha_m = 8/m(m^2-4)$. The earlier reasoning, involving Eq. 38,
establishes that $u(x,y)$, $u_x(x,y)$ and $u_y(x,y)$ are represented by absolutely and
uniformly convergent series of continuous functions and are therefore
continuous in the closed square. Moreover the termwise integrations and
differentiations leading to $u_y(x,y)$ are valid.

Now, $u_y(x, y)|_{x=0,\,a\pi}$ turns out to be identical with the right hand side of
Eq. 7.21 and therefore vanishes. That is to say, $u(x,y)$ is constant on
$x = 0, a\pi$. The continuity of $u(x,y)$ justifies our writing


(29) u(0, y) = u(0, 0) = c?/8a2[can/2 + — am]. 
Furthermore, 
(29.1) —x) + 00/28 + (o%na/m)C 


The summations in each of Eqs. 29 and 29.1 are easily evaluated, by
integrating the sine series for $S^2\,x/a$, whence it is seen that the right hand sides
of these equations vanish. Similar deductions apply for the sides $x, y = a\pi$.
We have thus established that


$u(x, y)|_\Gamma = v(y, x)|_\Gamma = 0.$


Since $u_y(x,y)$, $v_x(x,y)$ and $z_x z_y$ vanish at the corners, evidently $S_1$ (cf.
Eqs. 1.1) vanishes at the corners. From Eqs. 19, however, we should get for
the corner 0, 0


— 8, (0,0) = 2d?/8a?PS { (Am — 2phm)Th — Am Sch? mx/2}. 


The prescribed vanishing of the right hand expression above furnishes a check
on the computed values of $h_m$.


Solution of the algebraic system. We turn now to the consideration of
the solution of Eq. 19.5, which may be written for convenience in the form¹⁵


(31) hm — Kn(o) Cm(o) 
where 

Km(o) = b(c)/Th “(1 — G(c) =) Smr = m?r/(m? + 7?) 


7... 16m 
Cm > E | 
m(m? +- 4)? m* — 16 1 — Gmx/Sh mx 


We wish to bound Km(o) Clearly, 


15 The positiveness of a,,., K,,(¢), B,,, allows us to dispense with absolute value 


signs, but is in no wise essential for the proofs. 




m?%r/(m* + 12)? + 
Now 
4 1 3 5 y¢ * 00 rdr 


Therefore the left side of Eq. 32 is inferior to .282. We note 


$K_m(\sigma) \le K_1(\sigma)$.
The condition 
(32. 1) Kn(c) 1 


is satisfied for $w = K_1(\sigma)(.282) < 1$. Since $K_1(\sigma)$ is an increasing function
of $\sigma$ from 0 to 1 and $K_1(19/20)(.282) < 1$, the inequality Eq. 32.1 is verified
for $\sigma < 19/20$.

The restriction of interest at this stage is $|h_m| < M$. Evidently $|C_n(\sigma)|$
is inferior to a constant $D$ for all $n$. The solution of Eq. 31 is unique if
Eq. 32.1 is satisfied, and may be represented by a Neumann expansion


(33) $h_m = C_m + K_m \sum_r g_{mr} C_r + K_m \sum_r g_{mr} K_r \sum_s g_{rs} C_s + \cdots$


where the argument $\sigma$ has been omitted in the $C$ and $K$ terms.¹⁶
One method of approximation, in principle, to the true value of $h_m$ is
that of using a finite number of terms in the expansion to the right of Eq. 33.
The error committed in stopping at the $n$-th term is inferior to $Dw^n/(1-w)$,
where $|C_m| \le D < \infty$. Actually, however, although $K_m$ may be
calculated accurately, by using contour integration for instance, the value of
the double sums over $r$, $s$ in Eq. 33 is not so amenable.¹⁷ Eq. 32.1 guarantees the
applicability of the practical method of finite segments.¹⁸ This procedure amounts
to setting $h_m = 0$ for $m \ge 2N+1$ and solving for the first $N$ $h$'s from the
first $N$ equations of Eq. 31 by Cramer's rule. The approximation obtained
in this way is denoted by $h_m^N$.
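The two procedures just described, the truncated Neumann expansion and the method of finite segments, can be compared on a small model. The sketch below is only an illustration under stated assumptions: the kernel is modelled on the form $m^2 r/(m^2+r^2)^2$ that appears in the text, while the factor `K`, the right-hand side `C` and the size `N` are hypothetical, and Gaussian elimination is used in place of Cramer's rule. It checks the error bound $Dw^n/(1-w)$ on the truncated system, with $w$ taken as the largest absolute row sum.

```python
def kernel(m, r):
    """Kernel modelled on the text: m^2 r / (m^2 + r^2)^2."""
    return m * m * r / (m * m + r * r) ** 2

def build_system(N, K=1.0):
    """First N odd indices 1, 3, ..., 2N-1; K and C are hypothetical choices."""
    idx = [2 * j + 1 for j in range(N)]
    A = [[K * kernel(m, r) for r in idx] for m in idx]
    C = [1.0 / (m * (m * m + 4)) for m in idx]
    return A, C

def neumann(A, C, terms):
    """Partial sum C + A C + ... + A^(terms-1) C of the Neumann expansion."""
    h, term = C[:], C[:]
    for _ in range(terms - 1):
        term = [sum(a * t for a, t in zip(row, term)) for row in A]
        h = [x + t for x, t in zip(h, term)]
    return h

def solve_direct(A, C):
    """Solve (I - A) h = C by Gaussian elimination with partial pivoting."""
    n = len(C)
    M = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] + [C[i]]
         for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= f * M[col][j]
    h = [0.0] * n
    for i in range(n - 1, -1, -1):
        h[i] = (M[i][n] - sum(M[i][j] * h[j] for j in range(i + 1, n))) / M[i][i]
    return h

def row_norm(A):
    """w = largest absolute row sum; the expansion converges when w < 1."""
    return max(sum(abs(a) for a in row) for row in A)

if __name__ == "__main__":
    A, C = build_system(6)
    w, D = row_norm(A), max(abs(c) for c in C)
    exact = solve_direct(A, C)
    for n in (2, 5, 8):
        err = max(abs(x - y) for x, y in zip(neumann(A, C, n), exact))
        print(n, err, D * w ** n / (1 - w))
```

In this model the observed error falls well inside the bound, which is the behaviour the inequality Eq. 32.1 is meant to guarantee.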

We make the application of the method of finite segments more precise
by exhibiting bounds for the error in $h_m^N$. The solution of the $N$ equations
in $N$ unknowns may, of course, also be written as a Neumann expansion when
Eq. 32.1 holds. We denote by $k_m^N$ the solution of the set of equations
obtained by replacing $C_m$ by its absolute value. Accordingly


16 March, loc. cit., p. 311.

17 The difficulty arises from the fact that the summations yield the sum of two
logarithmic first derivatives of the gamma function, with conjugate complex arguments,
which are not tabulated so far as the writer is aware. (The terms in $m\pi/\mathrm{Sh}\,m\pi$ give
no trouble since they are negligible for large $m$.)

18 Pellet, Bull. de la Soc. Math. de France, t. 42 (1914), p. 48.




2n-1 2N-1 N-1 2N 
| N Ken” | = 2 | Cr| + fe 


2n-1 


2n-1 2N-1 2N-1 one 1 


2n-1 2n-1 2n-1 
(34) = | hm’ — hm" | for n<N 
where 
Bmr == K m&mr- 
Evidently 


0 = D/L —vw. 


Thus for $m$ fixed $\{k_m^N\}$ is a uniformly bounded non-decreasing sequence, and


has a unique limit value $k_m$. Therefore


(34. 1) | ken — | = | hen he |. 
Furthermore 
| km | = kemN = | ha | 
(34. 2) 
| em | = | hm |. 
Clearly, 
(35) | | D/m'*6 


where 1 > 8 > 0 and the minimum value for D depends on «. We have then 


(36) 2 < Ks (a) m? + p*)? 
m? dp 
for 
ap dp 1 
Now 
We have 


Hence the right side of Eq. 33 is inferior to 


D 
mio 1 — K, (c) = ( cos E/m ° 
2 


(37) 


The foregoing argument involves 


) 




71+é6 
(37. 1) 


2 
This is satisfied by $1 > \delta > 0$ for $\sigma < 5/7.3$. Hence from Eqs. 34.2, 35
and 37.2
$k_m < E/m^{1+\delta}$
(38)*° | am |, | |, E/m'*, 1>8>0, o < .685. 


For $\sigma = 1/4$ a rough calculation yields $\delta = .272$.

For utility in actual computations we replace $k_m$, $k_m^N$ by quantities which
can be calculated conveniently. We introduce the following system of
dominant equations which can be solved in closed form


m) Thmx/2 


(39) lm —b’ 
The dominant property holds provided D’ be taken large enough and 
b’ > (m? + = bm? (m? + 17)? 


for then the Neumann expansion for $l_m$ is surely greater term by term than
that for $k_m$. Since


> (m? + 1?) Th m*> (m? + 1?) | 
we require 

b’x = b 1 

4m \8m 8 Sh mr 
which is satisfied by 

b’ = b/2. 
With this value for $b'$ the solution for $l_m$ is
br 

40) 


which, according to Eq. 38, is valid for $\sigma < .685$. On combining Eqs. 34.1
and 40 convenient majorants for the errors $(h_m - h_m^N)$, in the $N$-th
approximation, are available, viz.:


* Since k,, > 0 and 
O(log m/m*) 
= 0 (log m/m 
f, p(p? + m?)? 

it may be shown that 6 < 1 in Eq. 38 even if Eq. 35 is replaced by |C,,, | < D/m*. This 
refers to the bound for k,,- However, there is a remote possibility that on taking 
account of the relation 0,/C,, <0, m > 1 that 6 > 1 is available for bounding | h,, | 
for some ¢, The point of this observation lies in the then consequent direct availability 
of Eq. 7.2 instead of Eq. 7.21 in the sense F. 




(34. 3) (1 — — | | — | 

Numerical applications. For many materials $\sigma = 1/4$. As illustration of
the paragraphs immediately preceding, as well as for the technological interest
of this case, we summarize some rough calculations (for $\sigma = 1/4$). The fourth
stage finite segment equations are obtained from Eq. 19.4:


.84195h;* — .02323h,* — .005728h;* — .0021683h,4 = + 1.20707 


— .06367h,* + .55175h3* — .02754h;* — .01325h,4 =— .498 
— .02616h,* — .03589h;3* + .57575h;* — .0226052h,4 = — .09065 
— .01386h,* — .03091h;* — .03165h,4 + .58586h,4 .033801. 


The values for h,*, h3* and h;* are about 
— 4.058, h,* h,* .0075. 


Manifestly, the best bounds of accuracy for $h_m^4$, consequent on Eq. 34.1,
are obtained for $n = N = 4$. For the purposes of this paper, however, we
may content ourselves with $n = 2$ (and the slide rule). It is easy to show
that for $\sigma = 1/4$, $D'$ may be taken as 2.54. From Eq. 39 we get


(42) $.587\,l_1^{2} - .198\,l_3^{2} = 2.54$
$-.066\,l_1^{2} + .89\,l_3^{2} = .925$
$l_1^{2} \approx 4.7$, $l_3^{2} \approx 1.3$.
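The two-stage system (42) is small enough to check by Cramer's rule, the method named earlier for the finite segment equations. The sketch below uses the coefficients as far as they can be read from the printed system (the superscript on $l$ is the stage index, not a power); the helper name is illustrative. With these coefficients the solution comes out near 4.80 and 1.40, somewhat above the quoted 4.7 and 1.3, a gap compatible with slide-rule arithmetic.

```python
def cramer_2x2(a11, a12, a21, a22, b1, b2):
    """Solve the system a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular system")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

# Coefficients of Eq. 42 as read from the printed text.
l1, l3 = cramer_2x2(0.587, -0.198, -0.066, 0.89, 2.54, 0.925)

if __name__ == "__main__":
    print(round(l1, 3), round(l3, 3))
```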
Eq. 40 yields
(42. 1) 1,92. 


Hence the error in h,’, for the two stage calculation ~ 1 and that of h;' ~.6. 
The values obtained in Eq. 41.1 involving a four stage approximation are


better. Since In —=—— > |hm| it is easy to see from the expressions for 
m 


$u$, $v$, $T_1$, $T_2$, and $S_1$ that $h_1$ and $h_3$ alone need be retained for most
technological applications. This simplification enables us also to conveniently
determine the general features of the pressure variations by substituting the
reduced expressions for $T$ and $U_{xy}$ in Eq. 6.

We proceed to a derivation of the $c$, $p$ relation consistent with the theory
presented in this paper. This relation is the whole purpose of Föppl's analysis
and he obtains
(43) c = .802(ar)*/(p/Erh)'” Cf. Eq. 11 
where $2a_1$ is the side of the square.
$\bar{p}$ replaces Föppl's $P$ for our treatment and is the mean pressure
averaged with respect to vertical displacement. That is to say,




(43.1) Work $= \int_0^{a\pi}\!\int_0^{a\pi} p\,z\,dx\,dy = \bar{p}\int_0^{a\pi}\!\int_0^{a\pi} z\,dx\,dy.$


Combining Eqs. 6 and 43.1 there results, after straightforward evaluation of
integrals and the choice $\sigma = 1/4$


8 


8 mar 
m?(m* — 16) m*—16 
mar 
Th? — 
2 
(m? + 4) (m* — 16) 2 m*—16 | 


A rough calculation indicates that the predominant contribution in the 
main bracket is that of — 3xr°/4 and that the terms under the summation 
sum to about —10. Since cube roots are taken below, the contribution of 
these summation terms has very little influence. That is to say the values 
assigned hm may be varied considerably without appreciably affecting the 
numerical coefficient in Eq. 44.2 below. We have 


(44. 1) cd? = ( 10) = (1 —o?) 
(44. 2) c= 


To compare this with Föppl's value we note the relations


= (p/h)r, Op = Ar /2, = A(1—o?). 
2\4/8 
Since (2) 1.4~ .8 + we have 
Tv 
(44. 3) C ~ (p/Erh)”. 


There is almost perfect concordance between Föppl's result and our own.
In view of the insensitiveness to the values of $h_m$, this, in itself, has
comparatively little significance in justifying his approximations. Cf. Eq. 28.


UNIVERSITY OF ILLINOIS, 
URBANA, ILLINOIS. 


: 
(44) — 32a*ap/cd? = 4 +82 m(m? — 4) + She 2 




TUBES AND SPHERES IN n-SPACES, AND A CLASS OF 
STATISTICAL PROBLEMS.* * 


By HAROLD HOTELLING.


1. The geometrical and the statistical problems. With reference to a
curve $C$ with continuously turning tangent in a metrical space of any number
of dimensions, we define a tube as the locus of points at a fixed distance $\theta$,
called the radius, from $C$, the distance being measured in each case along a
geodesic perpendicular to $C$. A sphere or geodesic sphere is of course the
locus of points at a fixed geodesic distance from a given point. Lengths and
areas of geodesic circles on a surface have been investigated by Bertrand
and Diguet,² who obtained the first two non-vanishing terms in the expansion
in powers of the radius. We shall generalize this result for spheres in
$n$ dimensions in § 5, and for tubes in § 6. The first term in such an expansion
is independent of the curvature properties of the space and of the curve, and
may therefore be found from the case of euclidean space. We shall see that
alternate terms in the series vanish. The problem is then to express the
others in terms of known invariants; this will be done for the first
non-vanishing terms following the euclidean ones. We shall also find exact and
simple expressions for the volumes enclosed by tubes in euclidean and spherical
spaces. In both these cases the volume enclosed is exactly the product of the
length of the curve by the $(n-1)$-dimensional area of a cross-section.
The qualification must however be made that overlapping regions must be
counted with their appropriate multiplicities. A necessary condition for
non-overlapping will be obtained. We shall confine our consideration to spaces
of positive definite distance elements.

A special type of normal coördinates associated with the arbitrarily given
curve is introduced in § 6. These may prove useful in a variety of geometrical
and physical problems.

Tubes on a hypersphere play a part in theoretical statistics. For example, 
if a set of observations 

$x_1, x_2, \cdots, x_n$
$y_1, y_2, \cdots, y_n$


* Received August 22, 1938. 

* Presented to the American Mathematical Society, September 6, 1938. 

² Journal de Mathématiques (Liouville), Ser. 1, vol. 13 (1848), pp. 80-86. L. P.
Eisenhart, Differential Geometry (1909), p. 209.




is used to determine the parameters $b$ and $p$ in the regression equation


(1.1) $Y = b f(x, p)$
in such a way that

$\sum (Y - y)^2$

is a minimum, then the correlation $R$ between the fitted values $Y$ and the
observed values $y$ (calculated without elimination of the mean) is the cosine
of the angle made by two lines through the origin of cartesian coördinates
in euclidean space of $n$ dimensions, one line through the point of coördinates
$y_\alpha$, the other through the point of coördinates $Y_\alpha$ $(\alpha = 1, 2, \cdots, n)$. On
the hypothesis that the $y_\alpha$ have no real relation to the $x_\alpha$, but are normally
and independently distributed about zero with a common variance, we may
regard the $y$-line as drawn to a random point on a unit hypersphere, on which
there is a uniform distribution of such points. The other line is drawn to a
point on the curve whose equations in terms of the parameter $p$ may be taken as


$Y_\alpha = f(x_\alpha, p)$ $(\alpha = 1, \cdots, n)$,


or the functions on the right may be multiplied by any constant. The method 
of least squares is such that R is made a maximum; consequently the Y-line is 
drawn through a point of the unit hypersphere lying on the curve C into 
which the foregoing curve is projected from the center, in such a way that 
the geodesic distance between the intersections of the two lines with the 
hypersphere is a minimum. The probability that R exceeds any assigned or 
observed value is the ratio to the whole (n —1)-dimensional “area” of the 
hypersphere of that portion of it contained within a tube about C. If C is a 
great circle the solution of this problem is known; this is the case if $f(x, p)$
is a linear function of $p$. $C$ is also a great circle if, when $x$ is replaced by
some function of a new variable $\xi$ and $p$ by some function of a variable $\tau$,
$f(x, p)$ reduces to a linear function of $\tau$. In other cases $C$ will be a curve
other than a great circle. The determination whether an observed correlation
is significant then requires the evaluation of the volume of a tube about $C$.
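The geometric reading of $R$ given above can be illustrated numerically. In the sketch below the regression function `f`, the synthetic data and the trial grid for $p$ are all hypothetical, chosen only for illustration. For each trial $p$ the least-squares $b$ is computed, $R$ is taken as the cosine of the angle between the vectors $y$ and $Y = bf$ (without elimination of the mean), and the $p$ that maximizes $R$ is selected. The identity $\sum (Y-y)^2 = \sum y^2\,(1 - R^2)$ shows why maximizing $R$ and minimizing the sum of squares are the same choice.

```python
import math

def f(x, p):
    """Hypothetical regression function f(x, p); any smooth family would do."""
    return math.exp(-p * x)

def fit_for_p(xs, ys, p):
    """For fixed p return (b, R, sse): least-squares b, uncentered R, residual sum."""
    fv = [f(x, p) for x in xs]
    yf = sum(y * v for y, v in zip(ys, fv))
    ff = sum(v * v for v in fv)
    yy = sum(y * y for y in ys)
    b = yf / ff
    R = abs(yf) / math.sqrt(yy * ff)   # cosine of the angle between y and Y = b f
    sse = sum((b * v - y) ** 2 for v, y in zip(fv, ys))
    return b, R, sse

def fit(xs, ys, grid):
    """Pick the p on the grid that maximizes R; returns (p, b, R, sse)."""
    return max(((p,) + fit_for_p(xs, ys, p) for p in grid), key=lambda t: t[2])

xs = list(range(8))
ys = [2.0 * math.exp(-0.5 * x) + 0.05 * (-1) ** x for x in xs]   # synthetic data
grid = [k / 100 for k in range(1, 201)]
p, b, R, sse = fit(xs, ys, grid)

if __name__ == "__main__":
    print(p, b, R, sse)
```

A grid search stands in here for solving the normal equations; it is a convenience of the sketch, not a recommendation.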

Similar considerations apply to other situations in which a parameter
enters in a non-linear fashion. For example we may fit a regression equation
of the form


(1.2) $Y = a + b f(x, p)$,


determining $a$, $b$ and $p$ so as to make $\sum (Y - y)^2$ a minimum. The reduction
of its theory to that of (1.1) is effected in the following manner, which
may easily be extended to other cases in which additional parameters enter
into the regression equation linearly. Put




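$\bar{f}(p) = \sum f(x_\alpha, p)/n$, $\bar{y} = \sum y_\alpha/n$,
$f'(x_\alpha, p) = f(x_\alpha, p) - \bar{f}(p)$, $y'_\alpha = y_\alpha - \bar{y}$.
The expression to be made a minimum,
$\sum \{a + b f(x_\alpha, p) - y_\alpha\}^2$,


when expressed in terms of the quantities just introduced, reduces easily, since


$\sum f'(x_\alpha, p) = 0 = \sum y'_\alpha$
to


$\sum \{b f'(x_\alpha, p) - y'_\alpha\}^2 + n\{a + b\bar{f}(p) - \bar{y}\}^2$.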


The minimizing of the first sum is of the same nature as in the case of the
regression equation (1.1); when $b$ and $p$ are determined in this way, $a$ is
determined immediately so as to make the other term vanish. The
distribution of the correlation between $y$ and $Y$, eliminating the means in this case,
is now determined by the volume of a tube on a unit hypersphere of $(n-2)$
dimensions in the flat space of $(n-1)$ dimensions whose equation is $\sum Y_\alpha = 0$.
The axis of the tube is the curve $C$ whose equations are


$\rho Y_\alpha = f'(x_\alpha, p)$ $(\alpha = 1, \cdots, n)$,


where $\rho$ is determined by the condition that $\sum Y_\alpha^2 = 1$.
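The splitting of the sum of squares into a centered part and a term in the means can be verified on arbitrary numbers. The sketch below uses hypothetical values for $f(x_\alpha, p)$ at one fixed $p$ and for the observations; the function name is illustrative. It confirms that the direct sum equals the two-term decomposition, and that the choice $a = \bar{y} - b\bar{f}$ makes the second term vanish.

```python
def centered_split(fv, ys, a, b):
    """Return (direct, first, second) for the sum of squares and its two-term split."""
    n = len(ys)
    f_bar = sum(fv) / n
    y_bar = sum(ys) / n
    direct = sum((a + b * v - y) ** 2 for v, y in zip(fv, ys))
    first = sum((b * (v - f_bar) - (y - y_bar)) ** 2 for v, y in zip(fv, ys))
    second = n * (a + b * f_bar - y_bar) ** 2
    return direct, first, second

fv = [1.0, 0.6, 0.37, 0.22, 0.14]   # hypothetical values of f(x_alpha, p) at fixed p
ys = [2.1, 1.2, 0.8, 0.4, 0.35]     # hypothetical observations
b = 1.7                             # arbitrary slope for the check
a_opt = sum(ys) / len(ys) - b * sum(fv) / len(fv)   # makes the second term vanish

if __name__ == "__main__":
    print(centered_split(fv, ys, 0.3, b))
    print(centered_split(fv, ys, a_opt, b))
```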

The numerical process of fitting regression equations non-linear in
parameters is considerably more laborious than in the linear case. It should be
noticed that in all such problems, while transformations of parameters and
also transformations of independent variates are permissible, it is not
permissible to make a transformation of the dependent variate $y$ without changing
the hypotheses underlying the application of the method of least squares to
the particular case. Thus, the common practice of taking logarithms of both
sides of such a regression equation as $Y = be^{px}$ in order to reduce it to linear
form leads to inexact results unless the errors in $\log y$, rather than in $y$ itself,
can be regarded as normally distributed with a common mean and variance.
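The caution about logarithms can be made concrete. The sketch below assumes an exponential regression of the form $Y = be^{px}$ and synthetic data with an additive disturbance in $y$ itself; both are assumptions made for illustration. It fits once by ordinary least squares on $\log y$ and once by minimizing the untransformed sum of squares over candidate values of $p$. Because the log-linear estimate of $p$ is included among the candidates, the direct fit can never have the larger sum of squares.

```python
import math

def fit_log_linear(xs, ys):
    """Ordinary least squares of log y on x; returns (b, p) for Y = b * exp(p * x)."""
    n = len(xs)
    ls = [math.log(y) for y in ys]
    xb, lb = sum(xs) / n, sum(ls) / n
    p = (sum((x - xb) * (l - lb) for x, l in zip(xs, ls))
         / sum((x - xb) ** 2 for x in xs))
    return math.exp(lb - p * xb), p

def best_b(xs, ys, p):
    """Least-squares b for fixed p in the untransformed equation."""
    ev = [math.exp(p * x) for x in xs]
    return sum(y * e for y, e in zip(ys, ev)) / sum(e * e for e in ev)

def sse(xs, ys, b, p):
    """Sum of squared deviations of y from b * exp(p * x)."""
    return sum((b * math.exp(p * x) - y) ** 2 for x, y in zip(xs, ys))

def fit_direct(xs, ys, candidates):
    """Minimize the untransformed sum of squares over candidate p values."""
    p = min(candidates, key=lambda q: sse(xs, ys, best_b(xs, ys, q), q))
    return best_b(xs, ys, p), p

xs = list(range(8))
ys = [2.0 * math.exp(-0.5 * x) + 0.05 * (-1) ** x for x in xs]   # additive disturbance
b_log, p_log = fit_log_linear(xs, ys)
grid = [-k / 200 for k in range(1, 401)] + [p_log]
b_dir, p_dir = fit_direct(xs, ys, grid)

if __name__ == "__main__":
    print(sse(xs, ys, b_log, p_log), sse(xs, ys, b_dir, p_dir))
```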

As generalizations of (1.1) and (1.2) we may consider regression equa- 
tions involving two or more parameters in an essentially non-linear fashion. 
Outstanding among these are the harmonic of undetermined period, 


(1.3) $Y = a + b\cos(kt + \epsilon)$,


or more generally, a sum of such harmonics, and the logistic used to describe 
the growth of populations and of individual organisms, 


b 




It has not always been realized that periodogram analysis, at least in 
Schuster’s original sense of fitting a harmonic of the form (1.3), is essen- 
tially a problem in least squares, and that the problem of significance is a 
special case of the general one of least squares. The only published exact test 
of significance is due to R. A. Fisher³ and is predicated on the assumption
that only those periods are to be considered that are submultiples of the 
whole range of observations available. Empirical scientists in search of 
periodicities in sunspots, light variation of stars, rainfall, and business fluc- 
tuations have however not confined themselves to such a limited set of trial 
periods. The procedure is rather to try a very large number of periods,
perhaps greater than the number of observations, and select the one showing
greatest intensity. This is virtually equivalent to solving by trial the normal
equations corresponding to (1.3). The maximum intensity obtainable, divided 
by the mean square residual, will be a function of the correlation R between 
the observed values y and the values Y computed from the regression equation 
(1.3). The probability distribution of R in the absence of genuine periodicity, 
on the assumption of normally and independently distributed observations 
with a common mean and variance, may be found approximately for high 
values of R by the geometrical method. Indeed, by applying to (1.3) the 
same considerations by which the theory of (1.2) was reduced to that of 
(1.1), we arrive at the equations 


(1.5) $\rho Y_\alpha = f'(x_\alpha, k, \epsilon)$, $(\alpha = 1, 2, \cdots, n)$


satisfying the conditions $\sum Y_\alpha = 0$ and $\sum Y_\alpha^2 = 1$. The right-hand member
is simply the difference between $\cos(kx_\alpha + \epsilon)$ and the mean of this expression
for the various values of $x_\alpha$ (in applications, the times) corresponding to the
observations. We may regard (1.5) as the equations of a two-dimensional
surface with parameters $k$ and $\epsilon$, lying in the $(n-2)$-dimensional
hypersphere whose equations are


$\sum Y_\alpha = 0$, $\sum Y_\alpha^2 = 1.$


The probability of any particular value of $R$ being exceeded is proportional
to the volume of the hypersphere within a geodesic distance $\theta$ of this surface,
where $R = \cos\theta$. If we confine the range of periods, that is, of values of $k$,
so that the corresponding portion of the surface does not have too great
curvatures, and if $\theta$ is not too great, it is evident that this probability will
be exactly or approximately proportional to the area of the portion of the
surface explored.
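The view of periodogram analysis as a least-squares problem can be sketched as follows. For each trial value of $k$ the harmonic $a + b\cos(kt + \epsilon)$ is written as $a + A\cos kt + B\sin kt$, which is linear in $a$, $A$, $B$, and the correlation $R$ between the observations and the fitted values (means eliminated) is computed; the trial $k$ with the largest $R$ is then selected. The data, the grid of trial values and the function names are hypothetical, and no test of significance is attempted.

```python
import math

def harmonic_R(ts, ys, k):
    """Correlation R between y and the least-squares fit a + b*cos(k*t + eps), fixed k."""
    n = len(ts)
    yb = sum(ys) / n
    yc = [y - yb for y in ys]
    c = [math.cos(k * t) for t in ts]
    s = [math.sin(k * t) for t in ts]
    cb, sb = sum(c) / n, sum(s) / n
    c = [v - cb for v in c]
    s = [v - sb for v in s]
    cc = sum(v * v for v in c)
    ss = sum(v * v for v in s)
    cs = sum(u * v for u, v in zip(c, s))
    cy = sum(u * v for u, v in zip(c, yc))
    sy = sum(u * v for u, v in zip(s, yc))
    det = cc * ss - cs * cs
    if det <= 0.0:
        return 0.0
    A = (cy * ss - sy * cs) / det      # coefficient of cos(k t)
    B = (sy * cc - cy * cs) / det      # coefficient of sin(k t)
    explained = A * cy + B * sy        # sum of squares accounted for by the fit
    total = sum(v * v for v in yc)
    return math.sqrt(max(0.0, min(1.0, explained / total)))

def best_period(ts, ys, ks):
    """Trial value of k with the largest R, i.e. the least-squares choice on the grid."""
    return max(ks, key=lambda k: harmonic_R(ts, ys, k))

ts = list(range(40))
ys = [1.0 + 2.0 * math.cos(0.7 * t + 0.3) for t in ts]   # noise-free synthetic series
ks = [j / 100 for j in range(5, 301)]

if __name__ == "__main__":
    k = best_period(ts, ys, ks)
    print(k, harmonic_R(ts, ys, k))
```

Trying a fine grid of $k$ in this way corresponds to the practice, described above, of selecting the period of greatest intensity from a very large number of trial periods.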


³ “Tests of significance in harmonic analysis,” Proceedings of the Royal Society,
London, vol. 125 A (1929), pp. 54-59.




The method suggested by Fisher is equivalent to using a finite number, 
approximately (n—3)/2, of great circles on the hypersphere, at constant 
mutual geodesic distances from each other of a quarter of a great circle. Of 
these circles, the one nearest the sample point corresponds to the period of
maximum intensity. The probability appropriate to a test of significance by 
this method is the ratio to the whole (n—2)-dimensional volume of the 
hypersphere of the sum of the volumes of all the tubes about the selected 
circles, of radii equal to the minimum distance from the sample point. The 
aggregate volume of all these tubes will evidently be less than the volume of 
the region within geodesic distance $\theta$ of the surface (1.5), which passes through
the axes of the tubes. This merely means that the method allowing selection 
of any period in a continuous range gives a greater probability of a particular 
value of R being exceeded than does Fisher’s method of confining attention 
to certain predetermined periods, as was to be expected. Also, if the critical 
probabilities are made equal for the two tests, some intensities significant by 
Fisher’s method will not be significant by the method of continuous variability 
of period; while periods eliminated from consideration by Fisher’s method 
will sometimes appear significant when they are admitted to consideration. 

The logistic (1.4) may be dealt with similarly by finding the area of a 
surface of two dimensions in a hypersphere. But in this case the assumption 
of equal variances of the deviations for different values of y becomes question- 
able, and a transformation leading to a different form of the problem will 
usually be suggested by the application to be made. The logistic (1.4) 
satisfies the differential equation 


(1.6) $\dfrac{1}{y}\dfrac{dy}{dt} = a - b'y$,
where $b' = a/b$. The assumptions ordinarily underlying the use of the logistic
as a growth curve are more in keeping with the assumption of independence
and uniform variance for the deviations between the two members of this
differential equation than for the deviations between the members of (1.4).
The parameters of (1.6) enter in a linear fashion, so that in its fitting
classical methods are more appropriate than the relatively complex ones
associated with the direct fitting of the integrated logistic equation (1.4),
provided suitable estimates of the growth rate on the left of (1.6) are
available. One method of dealing with this situation has been given by the author
in an earlier paper.⁴ An analogous method based on a difference equation

⁴ “Differential equations subject to error, and population estimates,” Journal of
the American Statistical Association, vol. 22 (1927), pp. 283-314.




instead of a differential equation, had been given earlier by G. U. Yule;⁵
it appears to be the better of the two from a practical standpoint when, as 
Yule assumes, the time intervals between observations are strictly uniform. 
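The advantage of a relation that is linear in its parameters can be sketched numerically. The example below assumes a logistic of the form $y = b/(1 + ce^{-at})$, for which the proportional growth rate $(1/y)\,dy/dt$ equals $a - (a/b)y$; this form, the synthetic data and the central-difference estimate of the growth rate are assumptions of the sketch, and it reproduces neither the author's earlier method nor Yule's. The growth-rate estimates are regressed on $y$ by ordinary least squares, giving $a$ as intercept and $a/b$ as the negative of the slope.

```python
import math

def fit_logistic_rate(ts, ys):
    """Regress central-difference estimates of (1/y) dy/dt on y; returns (a, b_prime)."""
    rates, levels = [], []
    for i in range(1, len(ts) - 1):
        # d(log y)/dt estimated by a central difference over two intervals
        rates.append((math.log(ys[i + 1]) - math.log(ys[i - 1]))
                     / (ts[i + 1] - ts[i - 1]))
        levels.append(ys[i])
    n = len(rates)
    yb, rb = sum(levels) / n, sum(rates) / n
    slope = (sum((y - yb) * (r - rb) for y, r in zip(levels, rates))
             / sum((y - yb) ** 2 for y in levels))
    a = rb - slope * yb
    return a, -slope

ts = [0.25 * i for i in range(81)]
ys = [10.0 / (1.0 + 9.0 * math.exp(-0.5 * t)) for t in ts]   # synthetic, a = 0.5, b = 10
a, b_prime = fit_logistic_rate(ts, ys)

if __name__ == "__main__":
    print(a, a / b_prime)   # estimates of a and of the upper limit b
```

With observations subject to error the growth-rate estimates would themselves be noisy, which is the practical difficulty the text alludes to.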

But in this paper we shall not deal further with problems involving more 
than a single non-linear parameter, nor shall we discuss the integrals whose 
evaluation is necessary for practical work with the examples indicated above. 
The subsequent sections are offered purely as contributions to geometry, except 
that the results of Section 3 are essential to the tests of significance just 
described. As is usual in differential geometry, we shall assume the functions 
involved to have in the neighborhoods concerned continuous finite derivatives 
of all orders essential to the argument. Latin indices will be used to indicate
the values $1, 2, \ldots, n$, whereas Greek indices will take only the values
$2, \ldots, n$ throughout the paper, except in § 3. Repetition of a Greek index
within a term will denote summation from 2 to n; of a Latin index, summation
from 1 to n.


2. Tubes in euclidean space. In terms of cartesian coördinates $x_1, x_2, \ldots, x_n$
let the curve C be defined by the equations

(2.1) $x_i = f_i(v_1)$,

where $v_1$ is the distance along the curve from some fixed point. We shall
use primes to denote differentiation with respect to $v_1$. Denote by $\lambda_{i1}$ the
unit vector tangent to C, by $\lambda_{i2}$ the unit first curvature vector of C, and by
$\lambda_{i3}, \ldots, \lambda_{in}$ a set of unit vectors orthogonal to each other and to $\lambda_{i1}$ and
$\lambda_{i2}$, so chosen that the determinant $|\lambda_{ij}| = +1$. Then $\lambda_{i1} = f_i'(v_1)$; also
$\lambda_{ij}$ equals its cofactor in the determinant.


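Introducing curvilinear coördinates $v_2, \ldots, v_n$ by means of the
relations

(2.2) $x_i = f_i(v_1) + v_\alpha \lambda_{i\alpha}(v_1)$,

where the last term represents a sum from 2 to n with respect to $\alpha$, we have

$$J = \frac{\partial(x_1, \ldots, x_n)}{\partial(v_1, \ldots, v_n)} = \begin{vmatrix} \lambda_{11} + \lambda_{1\alpha}' v_\alpha & \lambda_{12} & \cdots & \lambda_{1n} \\ \vdots & \vdots & & \vdots \\ \lambda_{n1} + \lambda_{n\alpha}' v_\alpha & \lambda_{n2} & \cdots & \lambda_{nn} \end{vmatrix}.$$

Expanding with reference to the first column we obtain, since the cofactor of
the j-th element in this column equals $\lambda_{j1}$,

(2.3) $J = 1 + \lambda_{i1} \lambda_{i\alpha}' v_\alpha$.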


⁵ Journal of the Royal Statistical Society, vol. 88 (1925), pp. 1-58.




Upon differentiating the orthogonality condition $\lambda_{i1}\lambda_{i\alpha} = 0$ we obtain:

(2.4) $\lambda_{i1}\lambda_{i\alpha}' + \lambda_{i1}'\lambda_{i\alpha} = 0$.

The elementary relation between the principal normal, radius of first curvature
$\rho_1$, and rate of change of the direction of the tangent may be written

(2.5) $\lambda_{i1}' = \lambda_{i2}/\rho_1$.

Substituting this in (2.4), making use of the orthogonality of the vectors,
and substituting the result in (2.3) gives:⁶

(2.6) $J = 1 - v_2/\rho_1$.
It is clear that $v_2, \ldots, v_n$ are distances from the curve C in directions
perpendicular to the tangent and to each other. A tubular hypersurface of
radius $\theta$ therefore has the equation

$v_2^2 + v_3^2 + \cdots + v_n^2 = \theta^2$.

This may also be regarded as the equation of a hypersphere in space of n − 1
dimensions. Upon integrating (2.6) with respect to $v_2, \ldots, v_n$ over the
interior of this sphere, since the mean value of $v_2$ is zero, we obtain merely
the volume enclosed by the sphere, namely

(2.7) $\dfrac{\pi^{(n-1)/2}\,\theta^{n-1}}{\Gamma\left(\frac{n+1}{2}\right)}$.

Since this does not involve $v_1$, the tubular volume corresponding to an arc is
the product of (2.7) by the length of the arc.
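As a numerical illustration of (2.7), the following Python sketch multiplies the volume of the (n − 1)-dimensional cross-section by the length of the axial curve, and compares the case n = 3 with the familiar volume of a torus. The function names `ball_volume` and `tube_volume` are illustrative only and do not occur in the paper; overlapping of the tube with itself is ignored, as in the text.

```python
import math

def ball_volume(dim, radius):
    """Volume of a solid ball of the given radius in `dim` dimensions."""
    return math.pi ** (dim / 2) * radius ** dim / math.gamma(dim / 2 + 1)

def tube_volume(n, theta, length):
    """Volume of a tube of radius theta about a curve of the given length
    in euclidean n-space: the product of (2.7) by the length of the arc."""
    return ball_volume(n - 1, theta) * length

# Torus in 3-space: axial circle of radius 2, tube radius 0.5.
torus = tube_volume(3, 0.5, 2 * math.pi * 2.0)
```

For these values the product agrees with the classical torus volume $2\pi^2 R\,\theta^2$ with R = 2 and θ = 0.5, which equals π².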

This result is exact, but takes no account of overlapping of the tube 
with itself. Overlapping may be of portions of the tube corresponding to 
non-consecutive arcs, or it may be a local phenomenon resulting from the 
curvature of the axial curve being excessive in relation to the radius of the 
tube. The first kind is not within the domain of differential geometry, and
apparently nothing can be said about it without some further specialization
of the curve.

The second kind of overlapping, or kinking, will occur if and only if
$J = 0$ at some point within the tube. Since $|v_2| < \theta$ within the tube, it is


⁶ This method of evaluating J, which is simpler than my original reduction and
does not require continuity of derivatives of the vectors of orders higher than the first, 
was pointed out by Dean L. P. Eisenhart, to whom I am indebted for reading this paper 
and making several suggestions. 




evident from (2.6) that J vanishes within the tube if and only if $\theta > \rho_1$.
Thus the condition for non-overlapping of the local sort is that the radius of 
the tube shall not exceed the radius of first curvature of the axial curve, 
regardless of curvatures of higher order. 
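The local condition can be tried on a curve whose radius of first curvature is known in closed form. The sketch below uses the circular helix (a cos t, a sin t, b t), for which the radius of first curvature is (a² + b²)/a; the helix and the function names are chosen here for illustration and are not taken from the paper.

```python
def helix_first_curvature_radius(a, b):
    """Radius of first curvature of the circular helix
    (a cos t, a sin t, b t); it is constant along the curve."""
    return (a * a + b * b) / a

def has_local_overlap(theta, rho1_min):
    """True if a tube of radius theta kinks, i.e. J = 1 - v_2/rho_1
    vanishes somewhere with |v_2| < theta. `rho1_min` is the smallest
    radius of first curvature along the axial curve."""
    return theta > rho1_min

rho1 = helix_first_curvature_radius(1.0, 1.0)
```

With a = b = 1 the radius of first curvature is 2, so a tube of radius 2 is still free of kinks while any larger radius is not. Overlapping of non-consecutive arcs is a separate question that this check does not address.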


3. Tubes on a hypersphere. In terms of cartesian coördinates $x_1, x_2, \ldots, x_n$
in euclidean space of n dimensions we may write the equation of a
unit hypersphere

(3.1) $\Sigma x^2 = 1$.

In this section it will be convenient to use this type of notation, denoting
summation from 1 to n by the sign $\Sigma$ and frequently omitting subscripts.
Let the curve C on this sphere be defined in terms of the arc length s by the
n equations $x_i = x_i(s)$; let differentiation with respect to s be indicated by


primes ; and let a, = Then 


(3.2) $\Sigma x'^2 = 1$,

and, by differentiation of (3.1),

(3.3) $\Sigma x x' = 0$.

Differentiating the last equation and using (3.2) we have:

(3.4) $\Sigma x x'' = -1$.

The radius $\rho$ of first curvature relative to the euclidean space is given by

(3.5) $1/\rho^2 = \Sigma x''^2$.

If the n new quantities $\xi_1, \ldots, \xi_n$ are subject to the three equations

(3.6) $\Sigma x \xi = 0, \quad \Sigma x' \xi = 0, \quad \Sigma \xi^2 = 1$,

it is evident that they have n − 3 degrees of freedom for each value of s. Hence
we may write them as functions,

(3.7) $\xi_i = \xi_i(s, \phi_1, \ldots, \phi_{n-3})$,


of forms to be specified later. Restricting the $x_i$ to be cartesian coördinates
of a point on the curve C, and therefore functions only of s, we shall use
$y_1, \ldots, y_n$ as cartesian coördinates of a general point on the hypersphere,
whose equation $\Sigma y^2 = 1$ is satisfied identically by the expressions

(3.8) $y_i = x_i \cos\theta + \xi_i \sin\theta$,

because of (3.1) and (3.6). As curvilinear coördinates on the hypersphere
we shall use s, $\theta$, and the variables $\phi_1, \ldots, \phi_{n-3}$ appearing in (3.7). Taking
them in this order and using primes to denote partial differentiation with
respect to s, we have as the matrix of coefficients of the linear element,




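(3.9) $\begin{pmatrix} \Sigma y'^2 & \Sigma y' \dfrac{\partial y}{\partial\theta} & \sin\theta\, \Sigma y' \dfrac{\partial \xi}{\partial\phi_\gamma} \\ \Sigma y' \dfrac{\partial y}{\partial\theta} & \Sigma \left(\dfrac{\partial y}{\partial\theta}\right)^2 & \sin\theta\, \Sigma \dfrac{\partial y}{\partial\theta} \dfrac{\partial \xi}{\partial\phi_\gamma} \\ \sin\theta\, \Sigma y' \dfrac{\partial \xi}{\partial\phi_\beta} & \sin\theta\, \Sigma \dfrac{\partial y}{\partial\theta} \dfrac{\partial \xi}{\partial\phi_\beta} & \sin^2\theta\, \Sigma \dfrac{\partial \xi}{\partial\phi_\beta} \dfrac{\partial \xi}{\partial\phi_\gamma} \end{pmatrix}$.

From (3.6) we have

(3.10) $\Sigma x \xi' = 0, \quad \Sigma \xi \xi' = 0, \quad \Sigma x \dfrac{\partial \xi}{\partial\phi_\gamma} = 0, \quad \Sigma x' \dfrac{\partial \xi}{\partial\phi_\gamma} = 0, \quad \Sigma \xi \dfrac{\partial \xi}{\partial\phi_\gamma} = 0,$
$(\gamma = 1, \ldots, n-3)$.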

From these and the preceding identities it is easy to see that the elements in
the second row or second column of (3.9) are all zero, excepting the element
in the intersection of the second row and column, which equals unity. This
shows, by a well known theorem (RG,⁷ p. 58) that $\theta$ measures the distance
from C along geodesics of the hypersphere perpendicular to C.

Denoting by E the element in the upper left-hand corner of the matrix,
we have from (3.8) and (3.2),

(3.11) $E = \cos^2\theta + 2\cos\theta \sin\theta\, \Sigma x'\xi' + \sin^2\theta\, \Sigma \xi'^2$.


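The other elements in the first column are given by

$\sin\theta\, \Sigma (x' \cos\theta + \xi' \sin\theta) \dfrac{\partial \xi}{\partial\phi_\gamma} = \sin^2\theta\, \Sigma \xi' \dfrac{\partial \xi}{\partial\phi_\gamma}$

by (3.10). The element of (n − 1)-dimensional volume on the hypersphere
is the product of $ds\, d\theta\, d\phi_1 \cdots d\phi_{n-3}$ by the square root of the determinant
of (3.9). It therefore equals

(3.12) $\sin^{n-3}\theta\, \sqrt{G}\; ds\, d\theta\, d\phi_1 \cdots d\phi_{n-3}$,

where

(3.13) $G = \begin{vmatrix} E & \sin\theta\, \Sigma \xi' \dfrac{\partial \xi}{\partial\phi_\gamma} \\ \sin\theta\, \Sigma \xi' \dfrac{\partial \xi}{\partial\phi_\beta} & \Sigma \dfrac{\partial \xi}{\partial\phi_\beta} \dfrac{\partial \xi}{\partial\phi_\gamma} \end{vmatrix}$.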

This determinant will be evaluated with the help of a special orthogonal
ennuple of vectors of the Schmidt type.⁷ Denoting the i-th component of
the j-th vector by $\lambda_{ij}$ $(i, j = 1, \ldots, n)$ we put $\lambda_{i1} = x_i$, and then define
$\lambda_{i2}, \ldots, \lambda_{in}$ as linear functions of $x_i$ and its successive derivatives with respect
to s, such that each vector in the sequence involves a derivative of order higher
by unity than the preceding vector, and such that the whole set is orthogonal
and normal. Thus $\lambda_{i2} = x_i'$; and


(3.14) $\Sigma \lambda_{ik}\lambda_{im} = \delta_m^k$,

where $\delta_m^k$ is the Kronecker delta, equal to unity if k = m, and otherwise zero.
The formulae RG (32.16) analogous to those of Frenet and Serret give in
this case

(3.15) $\lambda_{ij}' = \dfrac{\lambda_{i,j+1}}{\rho_j} - \dfrac{\lambda_{i,j-1}}{\rho_{j-1}}$, $(i, j = 1, \ldots, n)$,

where the convention is made that $1/\rho_0 = 1/\rho_n = 0$ and $\rho_1, \ldots, \rho_{n-1}$ are certain
functions of s. (These $\lambda$'s and $\rho$'s are different from those of other
sections of this paper.) Putting j = 1 in (3.15) and recalling that $\lambda_{i1}' = x_i'$,
and that $\lambda_{i2}$ also equals $x_i'$, shows that $\rho_1 = 1$.

The quantities $z_j$ defined by the orthogonal transformation

(3.16) $z_j = \sum_i \lambda_{ij}\xi_i$

must, according to (3.6), satisfy the conditions

(3.17) $z_1 = 0, \quad z_2 = 0, \quad \Sigma z^2 = 1$.

Multiplying (3.16) by $\lambda_{mj}$ and summing with respect to j, using (3.14) and
(3.17), and then changing m to i, gives

(3.18) $\xi_i = \sum_j \lambda_{ij} z_j$.

We may regard $z_3, \ldots, z_n$ as cartesian coördinates of a point independent
of s on the sphere (3.17), whose intrinsic dimensionality is n − 3. We shall
regard $\phi_1, \ldots, \phi_{n-3}$ as spherical polar coördinates on this sphere, thus
specializing the arbitrary functions (3.7), which now take the form (3.18), where
the $\lambda$'s involve only s and the z's involve only the $\phi$'s. Putting


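(3.19) $u_{j\gamma} = \dfrac{\partial z_j}{\partial \phi_\gamma}$,

we therefore obtain by differentiating (3.18),

(3.20) $\dfrac{\partial \xi_i}{\partial \phi_\gamma} = \sum_j \lambda_{ij} u_{j\gamma}$,

⁷ L. P. Eisenhart, Riemannian Geometry, Princeton, 1926, Sec. 32. We shall refer
later to this treatise as RG.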




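and also, with the help of (3.15),

(3.21) $\xi_i' = \sum_j z_j \left( \dfrac{\lambda_{i,j+1}}{\rho_j} - \dfrac{\lambda_{i,j-1}}{\rho_{j-1}} \right)$.

Separating this into two summations, putting j + 1 = k in the first and
j − 1 = k in the second, introducing the notation

(3.22) $A_k = \dfrac{z_{k-1}}{\rho_{k-1}} - \dfrac{z_{k+1}}{\rho_k}$,

and making use of the relations $z_1 = z_2 = 1/\rho_0 = 1/\rho_n = 0$, we find:

(3.23) $\xi_i' = \sum_k \lambda_{ik} A_k$.

From (3.22) it follows that

(3.24) $A_1 = 0$,

and from (3.23), (3.14), (3.24) and (3.20),

(3.25) $\Sigma \xi'^2 = \Sigma A^2 = A_2^2 + A_3^2 + \cdots + A_n^2$,
(3.26) $\Sigma x' \xi' = A_2$,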
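(3.27) $\Sigma \xi' \dfrac{\partial \xi}{\partial\phi_\gamma} = \sum_k A_k u_{k\gamma}$,
(3.28) $\Sigma \dfrac{\partial \xi}{\partial\phi_\beta}\dfrac{\partial \xi}{\partial\phi_\gamma} = \sum_k u_{k\beta} u_{k\gamma}$.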

From (3.11), (3.26) and (3.25),

(3.29) $E = (\cos\theta + A_2 \sin\theta)^2 + (A_3^2 + \cdots + A_n^2) \sin^2\theta$.

Let the determinant in (3.13) be represented as the sum of two determinants,
identical in all but the first column, in such a way that one of these determinants
has as its first element $(\cos\theta + A_2 \sin\theta)^2$, and otherwise has zeros in
the first column. We may thus write, with the help of (3.25) to (3.29)
inclusive


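(3.30) $G = F + H^2 \sin^2\theta$,

where

(3.31) $F = (\cos\theta + A_2 \sin\theta)^2 \begin{vmatrix} \Sigma u_{i1}^2 & \cdots & \Sigma u_{i1} u_{i,n-3} \\ \vdots & & \vdots \\ \Sigma u_{i1} u_{i,n-3} & \cdots & \Sigma u_{i,n-3}^2 \end{vmatrix}$

and

$H = \begin{vmatrix} A_3 & \cdots & A_n \\ u_{31} & \cdots & u_{n1} \\ \vdots & & \vdots \\ u_{3,n-3} & \cdots & u_{n,n-3} \end{vmatrix}$.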


Differentiating the last of (3.6) gives $\Sigma \xi\xi' = 0$. If in this we substitute
(3.18) and (3.23), and use (3.14), we obtain

$\sum_k z_k A_k = 0$.

Differentiating $\Sigma z^2 = 1$ we have, from (3.19),

$\sum_k z_k u_{k\gamma} = 0$.

These equations establish a homogeneous linear relation among the columns
of H. Hence H = 0, so that G = F by (3.30). Denoting by G' the determinant
in the right-hand member of (3.31), and noting from (3.22) that
$A_2 = -z_3/\rho_2$ we thus obtain the element of volume (3.12) in the form

(3.32) $\sin^{n-3}\theta\, (\cos\theta - z_3 \sin\theta/\rho_2) \sqrt{G'}\; ds\, d\theta\, d\phi_1 \cdots d\phi_{n-3}$.

The integral of $\sqrt{G'}\, d\phi_1 \cdots d\phi_{n-3}$ over the sphere is simply the (n − 3)-dimensional
volume of this unit sphere, namely

$\dfrac{2\pi^{(n-2)/2}}{\Gamma\left(\frac{n-2}{2}\right)}$.

In integrating (3.32), the integral resulting from the second term in the
parenthesis vanishes because $z_3$ is measured perpendicularly from a diametral
plane of the sphere, and so has a mean value zero. Integrating also with
respect to $\theta$ and s, we have the simple result:

The volume enclosed by a tube of geodesic radius $\theta$ on a hypersphere
having intrinsically n − 1 dimensions is the product of the length of the
axial curve by

$\dfrac{\pi^{(n-2)/2} \sin^{n-2}\theta}{\Gamma(n/2)}$.
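The theorem can be checked against a case that is known independently. The sketch below assumes the factor $\pi^{(n-2)/2}\sin^{n-2}\theta/\Gamma(n/2)$ and applies it, for n = 3, to the band of geodesic half-width θ about a great circle of the ordinary unit sphere, whose area is 4π sin θ by elementary geometry. The function name `spherical_tube_volume` is hypothetical.

```python
import math

def spherical_tube_volume(n, theta, length):
    """Tube of geodesic radius theta about a curve of given length on the
    unit hypersphere in euclidean n-space (intrinsic dimension n - 1)."""
    factor = (math.pi ** ((n - 2) / 2) * math.sin(theta) ** (n - 2)
              / math.gamma(n / 2))
    return length * factor

# Band of half-width theta about a great circle of the unit sphere (n = 3).
theta = 0.3
band = spherical_tube_volume(3, theta, 2 * math.pi)
exact_band = 4 * math.pi * math.sin(theta)  # area between latitudes -theta and +theta
```

Taking θ = π/2 in the same case gives 4π, the whole area of the sphere, as it should.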

Local self-overlapping will exist if (3.32) vanishes within the tube. This
will occur if and only if $\tan\theta > \rho_2$, where $\theta$ is the geodesic radius. To
evaluate $\rho_2$ we first put j = 2 in (3.15) and deduce, since $\lambda_{i1} = x_i$, $\lambda_{i2} = x_i'$
and $\rho_1 = 1$, that

$\lambda_{i3} = \rho_2 (x_i + x_i'')$.

Squaring, summing with respect to i, and using (3.14), (3.1), (3.4) and
(3.5), we find

$1/\rho_2^2 = 1/\rho^2 - 1$.

The condition $\tan\theta \le \rho_2$ for absence of local self-overlapping is therefore
equivalent to

$\sin\theta \le \rho$.

This condition is also expressed by the statement that the geodesic radius of
the tube must not exceed the maximum radius of geodesic curvature of C if
there is to be no local overlapping.
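A small circle of the ordinary unit sphere gives a simple test of this condition. The sketch below assumes the relation $1/\rho_2^2 = 1/\rho^2 - 1$, which follows from the squaring step described above, and takes a small circle at angular distance α from a pole, which is a euclidean circle of radius sin α. The function names are illustrative, and the check is meant only for 0 < ρ < 1 and 0 < θ < π/2.

```python
import math

def rho2_from_rho(rho):
    """Radius rho_2 from the euclidean radius of first curvature rho of a
    curve on the unit hypersphere, using 1/rho_2^2 = 1/rho^2 - 1."""
    return rho / math.sqrt(1.0 - rho * rho)

def no_local_overlap(theta, rho):
    """Condition tan(theta) <= rho_2 in the equivalent form sin(theta) <= rho."""
    return math.sin(theta) <= rho

# Small circle at angular distance alpha from a pole.
alpha = 0.7
rho = math.sin(alpha)
```

Here ρ₂ comes out as tan α, and the tube is free of local overlapping exactly when its geodesic radius does not exceed α, that is, when it does not reach past the pole.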




As an application, we observe that in fitting the regression equation 
Y = ber 

we obtain a curve which, for $p = \pm\infty$, has ends. At these ends, the radius
of curvature becomes zero. Consequently, if the foregoing proposition re- 
garding the volume of a tube is to be applied to evaluate the goodness of fit, 
it is necessary either to confine attention to values of | p | less than some upper 
limit, or to make a special study of the volume in neighborhoods of the ends 
of the curve. In the former case the volumes of hemispherical caps over 
the ends should be added to that enclosed by the tube in determining the 
relevant probability. 

4. An orthogonality property.⁸ The following theorem concerns geodesic
spheres in an arbitrary Riemannian space; it reduces to one of Gauss
when the space is of two dimensions, and may be proved in a somewhat
similar manner:⁹

The geodesic sphere defined as the locus of points at a fixed geodesic
distance from a point O is perpendicular to the geodesics through O.

The differential equations of the geodesics in terms of the arc length s
are (RG, (17.8)):

(4.1) $\dfrac{d^2 x^i}{ds^2} + \left\{ {i \atop jk} \right\} \dfrac{dx^j}{ds} \dfrac{dx^k}{ds} = 0$.

Taking the geodesics through O as the coördinate lines along which $x^1 = s$
is the distance from O and the other coördinates $x^2, \ldots, x^n$ are constant, we
have

$\dfrac{dx^i}{ds} = \delta_1^i$.

Substituting this in (4.1) gives

$\left\{ {i \atop 11} \right\} = 0$.

Hence $[11, \alpha] = 0$. Since the choice of coördinates implies that $g_{11} = 1$
identically, it then follows that

(4.2) $\dfrac{\partial g_{1\alpha}}{\partial x^1} = 0$.


Consider also another system of coördinates $y^1, \ldots, y^n$ such that $y^i = \xi^i x^1$,
where the $\xi^i$ are any functions of $x^2, \ldots, x^n$ that have finite derivatives in a
neighborhood of O. Denoting by $g'_{jk}$ the components of the distance tensor in this
coördinate system, we have

$g_{1\alpha} = g'_{jk} \dfrac{\partial y^j}{\partial x^1} \dfrac{\partial y^k}{\partial x^\alpha}$.

Thus at O, where $x^1 = 0$, $g_{1\alpha} = 0$. Since (4.2) shows that $g_{1\alpha}$ is independent
of $x^1$, it follows that $g_{1\alpha} = 0$ everywhere. This proves the theorem. Another
proof of this theorem, based on the transversality condition of the calculus of
variations, could also be given.

⁸ In this and the following sections the notation is throughout that of RG. Latin
indices will in all cases vary from 1 to n, Greek indices from 2 to n.
⁹ L. P. Eisenhart, Differential Geometry, Boston, 1909, p. 207.


5. Spheres in a general curved space. Let $x^1, \ldots, x^n$ be normal
coördinates with origin at the center O of a geodesic sphere of radius $\theta$. The
element of volume is

(5.1) $\sqrt{g}\; dx^1 \cdots dx^n$,

where g is the determinant of the distance tensor $g_{ij}$. For a point on the
sphere let $\xi^1, \ldots, \xi^n$ be defined by the equations

(5.2) $x^i = \xi^i \theta$,

which are also the equations of the geodesics through O if $\theta$ is regarded as a
parameter and the $\xi^i$ as constants. The $\xi^i$ may be regarded as cartesian coördinates
of a point on a unit sphere in euclidean space of n dimensions. Let
$M_n\{\phi\}$ denote the mean value over this (n − 1)-dimensional sphere of a
function $\phi$. Obviously $M_n\{1\} = 1$, and $M_n\{\xi^i\} = 0$. From considerations of
symmetry it is further obvious that the mean value of the product of any
powers of the $\xi^i$ vanishes unless each of the $\xi^i$ enters into the product with an
even exponent. By integration with respect to spherical polar coördinates, or
in various other ways, it is easy to establish that

(5.3) $M_n\{(\xi^i)^2\} = \dfrac{1}{n}, \quad M_n\{(\xi^i)^4\} = \dfrac{3}{n(n+2)}$,

and at the same time, that the (n − 1)-dimensional volume itself is

(5.4) $A_{n-1} = \dfrac{2\pi^{n/2}}{\Gamma(n/2)}$.

At the origin of normal coördinates, $g_{ij} = \delta_{ij}$ and consequently g = 1.
Upon expanding $\sqrt{g}$ in a series of powers of the $x^i$ and substituting from
(5.2) we have therefore

(5.5) $\sqrt{g} = 1 + \left(\dfrac{\partial \sqrt{g}}{\partial x^i}\right)_0 \xi^i \theta + \dfrac{1}{2}\left(\dfrac{\partial^2 \sqrt{g}}{\partial x^i \partial x^j}\right)_0 \xi^i \xi^j \theta^2 + \cdots$.

The (n − 1)-dimensional volume of the sphere is found by integrating this
expression over the unit euclidean sphere and multiplying by $\theta^{n-1}$. In this
process all terms of odd order in $\theta$ vanish, since they are multiplied by odd
numbers of the $\xi^i$. We thus obtain

(5.6) $A_{n-1}\theta^{n-1} \left[ 1 + \dfrac{\theta^2}{2n} \sum_i \left(\dfrac{\partial^2 \sqrt{g}}{\partial x^i \partial x^i}\right)_0 + \cdots \right]$.

In terms of normal coördinates we have from RG (18.8) and (18.9)


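(5.7) $\left\{ {i \atop jk} \right\}_0 = 0$,

(5.8) $\left( \dfrac{\partial}{\partial x^k} \left\{ {m \atop ij} \right\} + \dfrac{\partial}{\partial x^i} \left\{ {m \atop jk} \right\} + \dfrac{\partial}{\partial x^j} \left\{ {m \atop ki} \right\} \right)_0 = 0$,

and further identities which we shall not use here, since we shall limit our consideration
of the series (5.6) to evaluating the second term in invariant form.
The Ricci tensor, RG (8.14), is

(5.9) $R_{ij} = \dfrac{\partial^2 \log\sqrt{g}}{\partial x^i \partial x^j} - \dfrac{\partial}{\partial x^k} \left\{ {k \atop ij} \right\} + \left\{ {m \atop ik} \right\} \left\{ {k \atop mj} \right\} - \left\{ {m \atop ij} \right\} \dfrac{\partial \log\sqrt{g}}{\partial x^m}$.

From RG (7.9), namely,

(5.10) $\left\{ {k \atop ki} \right\} = \dfrac{\partial \log\sqrt{g}}{\partial x^i}$,

we have

(5.11) $\dfrac{\partial^2 \log\sqrt{g}}{\partial x^i \partial x^j} = \dfrac{\partial}{\partial x^j} \left\{ {k \atop ki} \right\}$.

Putting m = k in (5.8), summing for k, and using (5.11), gives

(5.12) $\left( \dfrac{\partial}{\partial x^k} \left\{ {k \atop ij} \right\} \right)_0 = -2 \left( \dfrac{\partial^2 \log\sqrt{g}}{\partial x^i \partial x^j} \right)_0$.

From (5.9), (5.12) and (5.7),

(5.13) $(R_{ij})_0 = 3 \left( \dfrac{\partial^2 \log\sqrt{g}}{\partial x^i \partial x^j} \right)_0 = 3 \left( \dfrac{\partial^2 \sqrt{g}}{\partial x^i \partial x^j} \right)_0$,

since $(\sqrt{g})_0 = 1$ and, by (5.10) and (5.7), $(\partial \sqrt{g}/\partial x^i)_0 = 0$. The scalar
curvature of the space is defined as

(5.14) $R = g^{ij} R_{ij}$.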


Since $(g^{ij})_0 = \delta^{ij}$ the value taken by R at O is $\Sigma (R_{ii})_0$. Combining this
result with (5.13) and (5.6), we have for the (n − 1)-dimensional volume,

(5.15) $A_{n-1}\theta^{n-1} \left[ 1 + \dfrac{R_0}{6n}\theta^2 + \cdots \right]$,

where $R_0$ is the scalar curvature at the center. Integrating with respect to
$\theta$ we have as the n-dimensional volume enclosed by the sphere,

(5.16) $\dfrac{\pi^{n/2}\theta^n}{\Gamma\left(\frac{n}{2}+1\right)} \left[ 1 + \dfrac{R_0}{6(n+2)}\theta^2 + \cdots \right]$.

The case n = 2 of these results is due to Bertrand and Diguet. Their results
are obtained by putting n = 2, $R_0 = -2K$, where K is the Gaussian curvature
of the surface, in (5.15) and (5.16).
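The case n = 2 lends itself to a direct numerical comparison. The sketch below assumes the two-term forms $A_{n-1}\theta^{n-1}[1 + R_0\theta^2/6n]$ and $\pi^{n/2}\theta^n[1 + R_0\theta^2/6(n+2)]/\Gamma(n/2+1)$ for (5.15) and (5.16), and compares them on the ordinary unit sphere (K = 1, hence $R_0 = -2$) with the exact circumference 2π sin θ and area 2π(1 − cos θ) of a geodesic circle. The function names are illustrative only.

```python
import math

def geodesic_sphere_area(n, theta, r0):
    """Two-term approximation to the (n-1)-dimensional volume of a geodesic
    sphere of radius theta; r0 is the scalar curvature at the center in the
    sign convention of the text (r0 = -2K for a surface)."""
    a = 2 * math.pi ** (n / 2) / math.gamma(n / 2)
    return a * theta ** (n - 1) * (1 + r0 * theta ** 2 / (6 * n))

def geodesic_ball_volume(n, theta, r0):
    """Two-term approximation to the n-dimensional volume enclosed."""
    v = math.pi ** (n / 2) / math.gamma(n / 2 + 1)
    return v * theta ** n * (1 + r0 * theta ** 2 / (6 * (n + 2)))

# Unit 2-sphere: K = 1, so r0 = -2.
theta = 0.1
circ_err = abs(geodesic_sphere_area(2, theta, -2.0) - 2 * math.pi * math.sin(theta))
area_err = abs(geodesic_ball_volume(2, theta, -2.0) - 2 * math.pi * (1 - math.cos(theta)))
```

Both discrepancies are of the order of the first neglected terms of the series, which is what one expects of a truncation after the second term.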

6. Tubes in a general curved space. Referring once more to RG § 32,
we make use of the special Schmidt orthogonal ennuple for which $\lambda_{1|}^i = dx^i/ds$
is the direction of the tangent to the curve C. Hence from the generalized
Frenet formulae RG (32.16) for spaces of positive definite distance element,

(6.1) $\lambda_{p|,j}^i \lambda_{1|}^j = \dfrac{\lambda_{p+1|}^i}{\rho_p} - \dfrac{\lambda_{p-1|}^i}{\rho_{p-1}}$,

where $\rho_1, \ldots, \rho_{n-1}$ are the successive curvatures of C, and $1/\rho_0 = 1/\rho_n = 0$.
These equations hold in every coördinate system.

We shall denote by $x^1$ the arc distance along C from some fixed point, and
define $x^1$ at other points of the space by the condition that it shall be constant
on every geodesic perpendicular to C. We restrict attention to a region such
that no two geodesics normal to C at points of the region meet again within
the region, and every point of the region lies on such a geodesic. A point P
of this region then lies on a unique geodesic normal to the curve C; let Q be
the point at which this geodesic meets C, and let s be the geodesic distance QP.
Let $\xi^\alpha$ be the cosine of the angle at Q between the direction QP and the vector
$\lambda_{\alpha|}^i$ of the orthogonal set described above. We define¹⁰ the α-th coördinate of
P as $x^\alpha = \xi^\alpha s$. The equations of the geodesics QP in terms of the arc s as
parameter are therefore

(6.2) $x^1 = \text{constant}, \quad x^\alpha = \xi^\alpha s$.


Since the vectors of the Schmidt ennuple are mutually orthogonal and are 
tangent at points of CO to the codrdinate lines (i.e., the curves along each of 


¹⁰ It is important to bear in mind that in this section Greek indices take only the
values 2, ..., n, while Latin indices vary from 1 to n.


which only one codrdinate varies), we have at points of C: 


If we substitute the equations (6.2) in the differential equations (4.1)
of geodesics, we obtain:

$\left\{ {i \atop \alpha\beta} \right\} \xi^\alpha \xi^\beta = 0$.

These equations are valid throughout the region, though both the Christoffel
symbols and the $\xi^\alpha$ depend on the point P of evaluation. But at a point of C
the Christoffel symbols take on definite values, because of the continuity assumed
at the end of § 1, while the last equations hold when any numbers whatever
are substituted for the $\xi^\alpha$. Since a quadratic form can vanish for all sets
of values of the variables only if all the coefficients vanish, we must have at
every point of C,

(6.4) $\left\{ {i \atop \alpha\beta} \right\} = 0$.
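If we differentiate (4.1) with respect to s and substitute (6.2) we obtain
similarly the equations, valid at all points of C,

(6.5) $\dfrac{\partial}{\partial x^\gamma} \left\{ {i \atop \alpha\beta} \right\} + \dfrac{\partial}{\partial x^\alpha} \left\{ {i \atop \beta\gamma} \right\} + \dfrac{\partial}{\partial x^\beta} \left\{ {i \atop \gamma\alpha} \right\} = 0$.

In what follows it will be understood that the expressions considered are evaluated
on C. Since (6.4) holds at all points of C, irrespectively of the value of
$x^1$, the derivative of the left member with respect to $x^1$ vanishes on the curve.
In particular,

(6.6) $\dfrac{\partial}{\partial x^1} \left\{ {i \atop \alpha\beta} \right\} = 0$.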


From the definition of the coördinates above it follows that the direction
of the coördinate line along which only $x^p$ varies must at points of C coincide
with that of the unit vector $\lambda_{p|}^i$. Hence at such a point this coördinate line


must satisfy 


ds 


But such a line must also by its very nature satisfy 


Therefore $\lambda_{p|}^i = \delta_p^i$ on C. Elsewhere $\lambda_{p|}^i$ has not been defined, but for convenience
we define


at all other points of the space. The covariant derivative of this contravariant 
vector is by definition 


With (6.7) this gives 


6.8 ri -{ iit. 
(6. 8) pj 
Substituting (6.7) and (6.8) in (6.1) we have, at points of C, 
69 
pi Pp Pp-1 
In particular, 
2 
6.0 
al p1 
The components of the Ricci tensor (5.9) with subscripts $2, \ldots, n$
simplify on account of (6.4) and (6.6) to the form


Y 


The scalar curvature (5.14) may with the help of (6.3), (6.11) and (6.10) 
be expressed in the form 


(6. 12) R= Ry, + Rag 


0 log V 1 
og Vg 
= R,, + — , 


The mean curvature of the space at a point with respect to the direction
$\lambda_{1|}^i$ of the curve is defined as

$R' = R_{ij}\lambda_{1|}^i\lambda_{1|}^j$.

In view of (6.7) this gives

(6.13) $R' = R_{11}$.


The mean curvature with respect to any direction has the following geometrical
meaning (RG, p. 113). With each of n − 1 directions orthogonal to the
given direction and to each other, the given direction determines a pencil of
geodesics forming a surface. The sum of the Gaussian curvatures of these
n − 1 surfaces is the negative of the mean curvature of the space for the given
direction. From this we derive a geometrical interpretation of the scalar
curvature. Since in normal coördinates

$R = R_{11} + R_{22} + \cdots + R_{nn}$,

and since each term on the right is, like (6.13), the mean curvature with
respect to a particular one of an orthogonal ennuple of directions, it follows
that −R is twice the sum of the Gaussian curvatures of all the n(n − 1)/2
geodesic surfaces determined by these directions. If further we denote by S
the scalar curvature of the hypersurface $x^1$ = constant, we have from this interpretation
that −S is twice the sum of the Gaussian curvatures of those geodesic
surfaces determined by the ennuple which lie in the hypersurface. From
this it follows that

(6.14) $R = S + 2R'$.


With reference to the hypersurface $x^1$ = constant the components $g_{\alpha\beta}$ of
the distance tensor have the same values as for the n-space. The same is therefore
true of those Christoffel symbols whose indices have the values $2, \ldots, n$,
and of the derivatives with respect to $x^2, \ldots, x^n$ of these symbols. We shall
denote by h the (n − 1)-rowed determinant of the $g_{\alpha\beta}$, and by $S_{\alpha\beta}$ the Ricci
tensor of the hypersurface where it is pierced by C. Similarly to (5.11) we
have:


Plog 


(6. 15) 


If in (6.5) we replace i by y and then sum with respect to y from 2 to n we 
obtain with the help of (6. 15) 


| 
_o Flog Vh 


The Ricci tensor $S_{\alpha\beta}$ is obtained from the right-hand member of (5.9) by
replacing i, j, k, m respectively by $\alpha, \beta, \gamma, \delta$ and using h in place of g. With
(6.4) and (6.16) this gives

3 
(6.17) Sap = — 


Since $S = g^{\alpha\beta} S_{\alpha\beta}$ we have from (6.17) and (6.14),


4, 
(6. 18) 


Substituting this and (6.13) in (6.12) gives, after rearrangement, 


@loeVg R+R 1 
ap — 


From (5.10), (6.10), (6.4) and (6.3) it follows that 


dx P1 


Hence (6.19) gives 


V9 


(6. 20) 


For a fixed value of $x^1$ we may expand $\sqrt{g}$ in a series of powers of
$x^2, \ldots, x^n$, replace $x^\alpha$ by $\xi^\alpha\theta$ to obtain a series resembling (5.5) but with
Latin indices replaced by Greek, and then integrate over the (n − 2)-dimensional
volume of the sphere $\Sigma(\xi^\alpha)^2 = 1$. This gives for the volume element of
a tube $\sqrt{G}\; dx^1\, d\theta$, where


VG = Ano V9} 


+ — VS), Mn-1{&€°} +: 


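The symmetry considerations of § 5 together with (5.3), (5.4), (6.3) and
(6.20) reduce this to

$\dfrac{2\pi^{(n-1)/2}\theta^{n-2}}{\Gamma\left(\frac{n-1}{2}\right)} \left[ 1 + \dfrac{R + R'}{6(n-1)}\theta^2 + \cdots \right]$.

Integrating with respect to $\theta$ gives as the volume element of a tube of radius $\theta$,

$\dfrac{\pi^{(n-1)/2}\theta^{n-1}}{\Gamma\left(\frac{n+1}{2}\right)} \left[ 1 + \dfrac{R + R'}{6(n+1)}\theta^2 + \cdots \right]$.

It might have been thought on the basis of geometric visualization that this
result could have been obtained from (5.16) by replacing n by n − 1 and $R_0$
by S. But $R_0$ must be replaced, not by S, but by $R + R' = S + 3R'$.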

By an extension of the foregoing procedures it seems likely that a fairly
straightforward calculation would give the terms of these series, and of the
corresponding series (5.15) and (5.16), to any required degree. What is
required is to express the symmetrical sums of higher derivatives of $\sqrt{g}$ in
terms of invariants by formulae analogous to (6.20). Invariants available
for the purpose are the higher covariant derivatives of the right-hand member
of (6.20), and the contracted covariant derivatives of the Ricci tensor. It
is a question of some interest whether the volume element of the tube also
involves the various radii of curvature $\rho_k$ of C. It does not involve them in
either of the two cases we have examined fully, those of euclidean and of spherical
space. If they do enter in other cases, these will doubtless call for the use
of (6.9), a formula whose use could have been avoided in obtaining only the
terms found above.

The conditions for non-overlapping in terms of the radius of first curva- 
ture found for euclidean and spherical spaces do not seem capable of generali- 
zation to arbitrary spaces. The condition applicable instead is that the radius 
of the tube shall be so small that no two geodesics through the curve and per- 
pendicular to it shall meet again within the tube. 


COLUMBIA UNIVERSITY. 



ON THE VOLUME OF TUBES.* 


By HERMANN WEYL. 


1. The problem. In a lecture before the Mathematics Club at Princeton
last year Professor Hotelling stated the following geometric problem¹ as one
of primary importance for certain statistical investigations:

Let there be given in the n-dimensional Euclidean space $E_n$ or spherical
space $S_n$ a closed ν-dimensional manifold $C_\nu$. The solid spheres of given radius
a around all the points of $C_\nu$ cover a certain part $C_\nu(a)$ of the embedding space
$E_n$ or $S_n$, the volume V(a) of which is to be determined. We call $C_\nu(a)$ an
(n, ν)-tube (of radius a around $C_\nu$).


For small values of a one will have in the first approximation

(1) $V(a) = \Omega_m a^m k_0$,

where $\Omega_m a^m$ is the volume of the solid m-dimensional sphere

$t_1^2 + \cdots + t_m^2 \le a^2$

(m = n − ν), and $k_0$ the area of the “surface” $C_\nu$. Professor Hotelling
showed that this formula is exact in $E_n$, and a similar formula prevails in
$S_n$, for ν = 1. I shall here treat the problem for higher dimensionalities ν.
The result in $E_n$ is a formula consisting of 1 + [½ν] terms, of the following
type (§ 3):

(2) $V(a) = \Omega_m \sum_e \dfrac{k_e\, a^{m+e}}{(m+2)(m+4)\cdots(m+e)}$
(e even, 0 ≤ e ≤ ν),

where $k_e$ is a certain integral invariant of the surface $C_\nu$ determined by the
intrinsic metric nature of $C_\nu$ only, and thus independent of its embedding in
$E_n$. I shall express these invariants (§ 4) in terms of the Riemannian tensor
of $C_\nu$. An analogous result is obtained for $S_n$.


2. The fundamental formulas for the volume of tubes. If an
n-dimensional manifold $M_n$ consisting of points u and locally referred to

* Received October 14, 1938.
¹ See his paper “Tubes and spheres in n-spaces, and a class of statistical problems”
which precedes this article in this Journal, pp. 440-460.




parameters $u^1, \ldots, u^n$ is mapped upon the Euclidean space $E_n$ with the
(3) $\mathfrak{r} = \mathfrak{r}(u)$

then the volume V of the image of $M_n$ in $E_n$ may be computed by means of
the formula

(4) $V = \int [\mathfrak{r}_1, \ldots, \mathfrak{r}_n]\, du^1 \cdots du^n$,

where $[\mathfrak{r}_1, \ldots, \mathfrak{r}_n]$ designates the determinant of the n columns $\mathfrak{r}_i$, each consisting
of the components of the vector

$\mathfrak{r}_i = \partial \mathfrak{r}/\partial u^i$.

This formula takes account of the ± orientation and multiplicity with which
the mapping $u \to \mathfrak{r}$ covers the several parts of $E_n$. The covering will be
locally a one-to-one mapping without folds and ramifications wherever
$[\mathfrak{r}_1, \ldots, \mathfrak{r}_n] > 0$. But even if this condition is satisfied everywhere, multiple
covering might occur. This question is essentially one of topological rather
than differential geometric nature. It is with this reservation in mind that
in the following we apply formula (4).

When dealing with the spherical space Sn we employ homogeneous 
codrdinates (4%, %1,° *,%n) =r, the set pa; meaning the same point as %, 
whatever the factor p40. Sometimes we use the normalization 


a)? 


S,, then appears as the unit sphere in the Euclidean En,;. (4) must be replaced 
by the formula 


f 


as one easily verifies by observing the following facts: (1) the integrand is 
orthogonally invariant; (2) it is not affected by the gauge factor p=p(W) 


because 
(et)i ts 


(3) at the point r= (1,0,- - -,0) the integrand reduces to the “ Euclidean” 


value 
02, 
du’ Ou} 
0a, 
dun? 




After these preliminary remarks I now turn to our problem in En. Let 
a piece of the v-dimensional manifold Cy be given in the Gaussian representation 


(6) -w’). 


At each point we can determine m = n — v normal vectors n = n(1),°-+,n(m) 
satisfying the equations 
ta'n =0 


which are mutually normalized by 
n(p) -1(q) = 8p¢ (p,q =1,° 


ta is the derivative dr/du*. In using the radius vector (6) and these normals, 
the part Cv(a) of the space covered by the spheres of radius a around the 
points of Cy allows the representation 


(7) r=r+tn(1) +: 


in terms of the parameters - -,w’,t:,: *,t¢m. Hence its volume V(a) 
is the integral 


(8) f ° -,%v,n(1),° -,n(m)] dt, - dtmdu' - - + dw’. 


Following Gauss we describe the surface Cy embedded in Ey by its metric 
ground form 


(dr)? = gapdurd’, (gap = tp) 

a, 

together with the linear pencil of the second fundamental forms 
— > {tp Gap(p) durdu*}, 

p=1 a,B 

which is the scalar product of 
== > rapdu*duh 


with an arbitrary normal n= #,n(1) +: -+ tmn(m). 


Gap(p) = Gpa(p) =— tap’ = ta‘ 


A Greek subscript « attached to the vectors r, n and r always denotes 
differentiation by 




Each vector at the point wu of Cy is a linear combination of the basic 
vectors ta, n(p). On applying this remark to tta(p) we set 


where . . . indicates a linear combination of the normal vectors n(p). By 
scalar multiplication with rg one finds 


Gap(p) = gar Gp(p). 


From (7) one infers that 


Therefore the integrand in (8) 
p 


Because of the general identity 


-an]* = det (aiax), 


-n(m)]? =| gas |, 


and considering that 
ds = | gap |? - du’ 


is the area element of Cy, one arrives at the fundamental formula 


V(a) -f} f f | + (p) | «dtm ds 


in the Euclidean case. The integrand is independent of the choice of the 


parameters on Cy. 

In the spherical case, let the manifold’ Cy be given by the parametric 
representation (6) with the normalization r?>—1. Therefore r-ta—0. The 
mutually orthogonal normal vectors 1 = n(1),°*-,n(m) satisfy the equations 


r-n=0, n= 0. 
From both equations there follows 


= 0. 




The part C.(a) of the space Sn» covered by the m = (n— v)-dimensional 
solid spheres of spherical radius « is represented by 


r=r+én(1) +---+tnn(m), 


where the argument w in r,n(1),- - -,1(m) ranges over the whole Cv, while 
the parameters ¢,,° - -,¢m are bound by 
ty? ty? Sa’, (a = tana). 


According to equation (5) the volume V(a) of Cv(«) is given by the 
integral of 
[tri -n(m)] - - du’dt,: «dtm 


(9) 
extended with respect to uw’, --,u” over the whole of Cy, with respect to 
ty, * *,¢m over the sphere om(a). Application of the same procedure as 


before results in the formula 


+t 
dt,- 
x 2 2) (n+1)/2 
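3. Evaluation. For any function $\phi(t) = \phi(t_1, \ldots, t_m)$ let $\langle \phi \rangle$
designate its mean value over the sphere

(11) $t_1^2 + \cdots + t_m^2 = 1$.

The mean value $\langle t_1^{e_1} \cdots t_m^{e_m} \rangle$ of a monomial is obviously zero unless all
exponents $e_p$ are even. In the latter case one has the well-known formula

(12) $\langle t_1^{e_1} \cdots t_m^{e_m} \rangle = \dfrac{(e_1) \cdots (e_m)}{m(m+2) \cdots (m+e-2)}$
($e_p$ even, $e_1 + \cdots + e_m = e$),

where

$(0) = 1$, $(e) = 1 \cdot 3 \cdots (e-1)$ [for e = 2, 4, ...].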


[ (12) is most easily proved by multiplying the monomial by 


and then integrating over 


w (p =1,:--,m). 




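One thus obtains

    $\int_0^\infty e^{-r^2} r^{e+m-1}\, dr \cdot \int t_1^{e_1} \cdots t_m^{e_m}\, d\omega = \prod_{p=1}^m \int_{-\infty}^{+\infty} e^{-t^2} t^{e_p}\, dt,$

with $\int \cdots d\omega$ indicating the "solid angle" integration over the sphere
(11), and hence

    $\tfrac12\, \Gamma\!\left(\tfrac{m+e}{2}\right) \int t_1^{e_1} \cdots t_m^{e_m}\, d\omega = \prod_{p=1}^m \Gamma\!\left(\tfrac{e_p+1}{2}\right).$

In particular, for the surface $\omega_m = \int d\omega$ of the sphere,

(13)    $\tfrac12\, \Gamma\!\left(\tfrac{m}{2}\right) \omega_m = \left[\Gamma\!\left(\tfrac12\right)\right]^m.$

Division results in the desired equation

    $\langle t_1^{e_1} \cdots t_m^{e_m}\rangle_1 = \dfrac{\Gamma(m/2)}{\Gamma((m+e)/2)} \prod_{p=1}^m \dfrac{\Gamma((e_p+1)/2)}{\Gamma(1/2)}.$

The volume of the solid sphere $\sigma_m(a)$ amounts to

    $\omega_m \int_0^a r^{m-1}\, dr = \dfrac{\omega_m}{m}\, a^m = \Omega_m a^m,$

hence $\Omega_m = \omega_m / m$. Specialization of (13) for $m = 2$ yields $[\Gamma(\tfrac12)]^2 = \pi$.
The numbers $\omega_m$, $\Omega_m$ are best defined by the recursive formulas readily derived
from (13):

    $\omega_{m+2} = \dfrac{2\pi}{m}\, \omega_m \quad (m \ge 1); \qquad (\omega_1 = 2,\ \omega_2 = 2\pi).$
    $\Omega_{m+2} = \dfrac{2\pi}{m+2}\, \Omega_m \quad (m \ge 0); \qquad (\Omega_0 = 1,\ \Omega_1 = 2).]$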
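
The mean-value formula (12), its Gamma-function form, and the recursions for the sphere constants can be cross-checked numerically. The following Python sketch is only an illustration: the function names are hypothetical, and the closed form $2\pi^{m/2}/\Gamma(m/2)$ used for comparison is an assumption that is equivalent to (13) because $[\Gamma(\tfrac12)]^2 = \pi$.

```python
import math

def odd_product(e):
    """Weyl's symbol (e) = 1*3*...*(e-1) for even e, with (0) = 1."""
    result = 1
    for k in range(1, e, 2):
        result *= k
    return result

def mean_monomial(exponents):
    """Formula (12): mean of t_1^e_1 ... t_m^e_m over the unit sphere in m dimensions."""
    if any(e % 2 for e in exponents):
        return 0.0  # the mean vanishes unless every exponent is even
    m, e = len(exponents), sum(exponents)
    numerator = math.prod(odd_product(ep) for ep in exponents)
    denominator = math.prod(m + 2 * j for j in range(e // 2))  # m(m+2)...(m+e-2)
    return numerator / denominator

def mean_monomial_gamma(exponents):
    """The same mean value in Gamma-function form (valid for even exponents only)."""
    m, e = len(exponents), sum(exponents)
    value = math.gamma(m / 2) / math.gamma((m + e) / 2)
    for ep in exponents:
        value *= math.gamma((ep + 1) / 2) / math.gamma(0.5)
    return value

def omega(m):
    """Surface of the unit sphere via the recursion omega_{m+2} = 2*pi*omega_m/m."""
    if m == 1:
        return 2.0
    if m == 2:
        return 2 * math.pi
    return 2 * math.pi * omega(m - 2) / (m - 2)

def Omega(m):
    """Volume of the unit solid sphere via Omega_{m+2} = 2*pi*Omega_m/(m+2)."""
    if m == 0:
        return 1.0
    if m == 1:
        return 2.0
    return 2 * math.pi * Omega(m - 2) / m

if __name__ == "__main__":
    print(mean_monomial((2, 0, 0)), mean_monomial_gamma((2, 0, 0)))
    print(omega(3), Omega(3))
```

For instance, the mean of $t_1^2$ over the ordinary sphere ($m = 3$) comes out as $1/3$ in both forms, as symmetry requires.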


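We expand the determinant

    $\bigl|\,\delta_\alpha^\beta + \sum_p t_p\, G_\alpha^\beta(p)\,\bigr|$

according to degrees in the variables $t_1, \cdots, t_m$:

    $\psi_e(t_1 \cdots t_m) = \sum \psi_{e_1 \cdots e_m}\, t_1^{e_1} \cdots t_m^{e_m} \qquad (e_1 + \cdots + e_m = e)$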




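is homogeneous of degree $e$. $\psi_0 = 1$. This decomposition is conveniently
described by introducing an artificial parameter $\lambda$:

    $\bigl|\,\delta_\alpha^\beta + \lambda \sum_p t_p\, G_\alpha^\beta(p)\,\bigr| = 1 + \lambda\psi_1 + \cdots + \lambda^\nu \psi_\nu.$

We set

    $\langle\psi_e\rangle_1 = \dfrac{H_e}{m(m+2) \cdots (m+e-2)}.$

By its definition, $H_e$ is a point invariant of $C_\nu$. $H_e$ is zero for odd $e$, while
for even $e$ one derives from (12) the explicit expression

    $H_e = \sum (e_1) \cdots (e_m) \cdot \psi_{e_1 \cdots e_m}, \qquad (e_p \text{ even},\ e_1 + \cdots + e_m = e).$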


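The integral over the solid sphere $\sigma_m(a)$,

    $\int\!\cdots\!\int_{\sigma_m(a)} \psi_e\; dt_1 \cdots dt_m,$

then will turn out to be

    $\omega_m \langle\psi_e\rangle_1 \int_0^a r^{e+m-1}\, dr = \Omega_m H_e\, \dfrac{a^{m+e}}{(m+2)(m+4) \cdots (m+e)}.$

Thus we find in the Euclidean case

(14)    $V(a) = \Omega_m \sum_e k_e\, \dfrac{a^{m+e}}{(m+2)(m+4) \cdots (m+e)}, \qquad (e \text{ even},\ 0 \le e \le \nu),$

with the coefficients

(15)    $k_e = \int_{C_\nu} H_e\, ds.$

In the spherical case one gets

    $\int\!\cdots\!\int_{\sigma_m(a)} \dfrac{\psi_e\; dt_1 \cdots dt_m}{(1 + t_1^2 + \cdots + t_m^2)^{(n+1)/2}} = \dfrac{\omega_m H_e}{m(m+2) \cdots (m+e-2)} \int_0^a \dfrac{r^{e+m-1}\, dr}{(1+r^2)^{(n+1)/2}}.$

On putting $r = \tan\rho$ the integral at the right side becomes

    $\int_0^\alpha (\sin\rho)^{m+e-1} (\cos\rho)^{\nu-e}\, d\rho,$

and instead of (14) one obtains

(16)    $V(\alpha) = \omega_m \sum_e k_e J_e(\alpha), \qquad (e \text{ even},\ 0 \le e \le \nu),$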




where

(17)    $m(m+2) \cdots (m+e-2)\, J_e(\alpha) = \int_0^\alpha (\sin\rho)^{m+e-1} (\cos\rho)^{\nu-e}\, d\rho.$

One may notice the recurrent equation

    $\dfrac{(\sin\alpha)^{e+m} (\cos\alpha)^{\nu-e-1}}{m(m+2) \cdots (m+e)} = J_e(\alpha) - (\nu-e-1)\, J_{e+2}(\alpha).$

THEOREM. The volumes of $(n, \nu)$-tubes in Euclidean and in spherical
space are given by the formulas (14), (16) respectively, $J_e(\alpha)$ being defined
by (17). $k_e$, (15), are certain integral invariants of $C_\nu$, in particular $k_0$ is
its surface.
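
The recurrent equation for $J_e(\alpha)$ and the Euclidean formula (14) lend themselves to a quick numerical sanity check. The sketch below is illustrative only; its function names are hypothetical. For the tube check it assumes an ordinary sphere of radius $R$ in three-space ($n = 3$, $\nu = 2$, $m = 1$) with $k_0 = 4\pi R^2$ and $k_2 = 4\pi$. The latter value rests on the assumption that for $m = 1$ the invariant $H_2$ reduces to the Gaussian curvature $1/R^2$. The exact volume of the shell between radii $R - a$ and $R + a$ serves as reference.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def even_product(m, e):
    """The product m(m+2)...(m+e-2); equal to 1 for e = 0."""
    return math.prod(m + 2 * j for j in range(e // 2))

def J(e, alpha, m, nu):
    """J_e(alpha) as defined by (17), evaluated by quadrature."""
    integrand = lambda p: math.sin(p) ** (m + e - 1) * math.cos(p) ** (nu - e)
    return simpson(integrand, 0.0, alpha) / even_product(m, e)

def tube_volume_euclidean(m, a, k):
    """Formula (14); k maps each even e to the integral invariant k_e."""
    unit_ball = math.pi ** (m / 2) / math.gamma(m / 2 + 1)  # Omega_m
    total = 0.0
    for e, k_e in k.items():
        denominator = math.prod(m + 2 * j for j in range(1, e // 2 + 1))  # (m+2)...(m+e)
        total += k_e * a ** (m + e) / denominator
    return unit_ball * total

if __name__ == "__main__":
    m, nu, alpha = 3, 4, 0.7
    for e in (0, 2):
        lhs = math.sin(alpha) ** (e + m) * math.cos(alpha) ** (nu - e - 1) / even_product(m, e + 2)
        rhs = J(e, alpha, m, nu) - (nu - e - 1) * J(e + 2, alpha, m, nu)
        print(e, lhs, rhs)
    R, a = 2.0, 0.5
    print(tube_volume_euclidean(1, a, {0: 4 * math.pi * R ** 2, 2: 4 * math.pi}))
    print(4 * math.pi / 3 * ((R + a) ** 3 - (R - a) ** 3))
```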


4. Intrinsic nature of the invariants $k_e$. So far we have hardly done
more than what could have been accomplished by any student in a course of
calculus. However, some less obvious argument is needed for ascertaining that
more explicit form of the point invariant $H_e$ which enables one to replace the
curvature $G_\alpha^\beta(p)$ by the Riemannian tensor $R^\kappa{}_{\lambda\alpha\beta}$ of $C_\nu$. I repeat the
definition of this tensor in terms of the metric ground tensor:


Ag px 


[definition of the affine connection Igg], 


p ; 


After raising the index $\lambda$, the tensor $R^{\kappa\lambda}{}_{\alpha\beta}$
is not only skew-symmetric in $\alpha\beta$, but also in $\kappa\lambda$. As a part of the
integrability conditions expressing the Euclidean nature of the embedding
space $E_n$, one has the relations ²

(18)    $R^{\kappa\lambda}{}_{\alpha\beta} = \sum_{p=1}^m \{ G_\alpha^\kappa(p)\, G_\beta^\lambda(p) - G_\alpha^\lambda(p)\, G_\beta^\kappa(p) \}.$

In the spherical case we look upon $C_\nu$ as a surface in $E_{n+1}$. To the set of $m$
normals $n(p)$, $(p = 1, \cdots, m)$ one has simply to add $n(0) = r$. Since


² See H. Weyl, Mathematische Zeitschrift, vol. 12 (1922), p. 154.


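    $n_\alpha(0) = r_\alpha \quad\text{or}\quad G_\alpha^\beta(0) = \delta_\alpha^\beta,$

(18) changes into the equation

(19)    $R^{\kappa\lambda}{}_{\alpha\beta} - (\delta_\alpha^\kappa \delta_\beta^\lambda - \delta_\alpha^\lambda \delta_\beta^\kappa) = \sum_{p=1}^m \{ G_\alpha^\kappa(p)\, G_\beta^\lambda(p) - G_\alpha^\lambda(p)\, G_\beta^\kappa(p) \}.$

[It is a pity that the inadequate name "curvature," which ought to be
reserved for $G_\alpha^\beta(p)$, has been attached to the Riemann tensor. In the paper
just quoted I proposed the more descriptive term "vector vortex." The left
side of (19), and also of (18), is the excess of the vortex of $C_\nu$ over that of
the embedding space. In this form the relation would hold with an arbitrary
embedding Riemann space.]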

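We must try then to express the spherical average

    $\bigl\langle \det\bigl(\delta_\alpha^\beta + \lambda \sum_p t_p\, G_\alpha^\beta(p)\bigr) \bigr\rangle_1$

in terms of the quantities

    $H\binom{\alpha\beta}{\kappa\lambda} = \sum_p \{ G_\alpha^\kappa(p)\, G_\beta^\lambda(p) - G_\alpha^\lambda(p)\, G_\beta^\kappa(p) \}.$

In this investigation the

    $G_\alpha^\beta = (G_\alpha^\beta(1), \cdots, G_\alpha^\beta(m)),$

just as

    $t = (t_1, \cdots, t_m),$

may be looked upon as arbitrary vectors in an $m$-dimensional Euclidean space
$E_m$. Using for a moment the abbreviation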


one has 


Ges « de 
Za, ? » 


Hence we try to determine

    $\langle \det(t \cdot G_\alpha^\beta) \rangle_1$

where $G_\alpha^\beta$ $(\alpha, \beta = 1, \cdots, e)$ are any given vectors in $E_m$.




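LEMMA.

(21)    $\langle \det(t \cdot G_\alpha^\beta) \rangle_1 = \dfrac{1}{m(m+2) \cdots (m+e-2)} \sum \delta(P)\, H\binom{\alpha_1\alpha_2}{\beta_1\beta_2} \cdots H\binom{\alpha_{e-1}\alpha_e}{\beta_{e-1}\beta_e}.$

$\alpha_1 \cdots \alpha_e$; $\beta_1 \cdots \beta_e$ are the numbers $1, \cdots, e$ in any two arrangements,
$\delta(P) = \pm 1$ according as the permutation carrying the $\alpha$- into the $\beta$-
arrangement is even or odd. The sum extends over all couplings of pairs

(22)    $\begin{matrix} (\alpha_1\alpha_2) & (\alpha_3\alpha_4) & \cdots & (\alpha_{e-1}\alpha_e) \\ (\beta_1\beta_2) & (\beta_3\beta_4) & \cdots & (\beta_{e-1}\beta_e) \end{matrix}$

By a "pair" $(\alpha_1\alpha_2)$ we mean here two distinct numbers irrespective
of their order. Indeed the term $\delta(P)\, H\binom{\alpha_1\alpha_2}{\beta_1\beta_2} \cdots$ under the sum $\sum$ on the right side
of (21) does not change under reversal of an $\alpha$-pair, $(\alpha_1\alpha_2) \to (\alpha_2\alpha_1)$, or of a
$\beta$-pair. Nor does it change under permutation of its $e/2$ factors $H$; therefore
only the coupling of the $\alpha$-pairs with the $\beta$-pairs, but not the order of the $e/2$
blocks of the scheme (22) matters. Of the $2^e \cdot (\tfrac12 e)!$ equal terms arising from
(22) by inverting any of the $e$ pairs of indices and by permuting the $e/2$
factors $H$, only one is retained in the sum.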
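Taking the lemma for granted, we find at once

(23)    $H_e = \sum \delta(P)\, H\binom{\alpha_1\alpha_2}{\beta_1\beta_2} \cdots H\binom{\alpha_{e-1}\alpha_e}{\beta_{e-1}\beta_e},$

where the sum now extends to all couplings of pairs (22) from the larger
range $1, 2, \cdots, \nu$ for which the $\beta$-sequence consists of the same $e$ distinct
figures as the $\alpha$-sequence. The invariant nature of the sum to the right is
evidenced when we first write it as

    $\dfrac{1}{2^e (e/2)!} \sum_{\alpha_1, \cdots, \alpha_e} \sum \pm H\binom{\alpha_1\alpha_2}{\alpha_{1'}\alpha_{2'}} \cdots H\binom{\alpha_{e-1}\alpha_e}{\alpha_{(e-1)'}\alpha_{e'}},$

the inner sum alternatingly running over the permutations $1', \cdots, e'$ of
$1, \cdots, e$. The limitation of distinctness imposed upon $\alpha_1, \cdots, \alpha_e$ can be
canceled, as the inner sum vanishes if two of the $\alpha$'s coincide. Hence

(24)    $H_e = \dfrac{1}{2^e (e/2)!} \sum \pm \Bigl\{ \sum_{\alpha_1, \cdots, \alpha_e = 1}^{\nu} H\binom{\alpha_1\alpha_2}{\alpha_{1'}\alpha_{2'}} \cdots H\binom{\alpha_{e-1}\alpha_e}{\alpha_{(e-1)'}\alpha_{e'}} \Bigr\}.$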




The inner sum in which each $\alpha$ runs independently from 1 to $\nu$ is a scalar.
We have thus arrived at the decisive

THEOREM. The scalar $H_e$ on $C_\nu$ is determined by the formulas (23),
(24) where $H\binom{\alpha\beta}{\kappa\lambda}$ is the Riemann tensor or vortex $R^{\kappa\lambda}{}_{\alpha\beta}$ in the Euclidean
case, and the vortex excess (19) in the spherical case.

These metric scalars $H_e$ deserve attention on their own merits: they are
probably the simplest and most fundamental scalars built up by the Riemann
tensor.

As a very special case of our theorem we find that the one term formulas

    $V(a) = \Omega_m a^m k_0, \qquad V(\alpha) = \omega_m J_0(\alpha)\, k_0$

prevail if $C_\nu$ is applicable on $E_\nu$ or $S_\nu$ respectively. $k_0$ denotes the surface of
$C_\nu$. Professor Hotelling's result concerning the tubes around a curve, $\nu = 1$,
is fully contained in this special case.

The lemma is proved by an invariant-theoretic argument as follows. We
consider the $e^2$ vectors $G_\alpha^\beta$ as independent variables.

    $\Phi = \langle \det(t \cdot G_\alpha^\beta) \rangle_1$

is an orthogonal invariant of these variables and therefore, according to the
theory of orthogonal vector invariants,³ expressible as a polynomial in terms
of the scalar products $(G_\alpha^\kappa \cdot G_\beta^\lambda)$. Observing that $\Phi$ is linear and homogeneous
in the components of the vectors of each row and each column of the
scheme

    $\| G_\alpha^\beta \|,$

we realize that it must be a linear combination of terms

    $(G_{\alpha_1}^{\beta_1} \cdot G_{\alpha_2}^{\beta_2}) \cdots (G_{\alpha_{e-1}}^{\beta_{e-1}} \cdot G_{\alpha_e}^{\beta_e}),$

where the $\alpha$ and $\beta$ are any two arrangements of $1, \cdots, e$. Moreover $\Phi$ is
skew-symmetric with respect to the columns. Hence, by summing alternatingly
over the $e!$ permutations of the superscripts $\beta$ we find that $\Phi$ is a linear
combination of the following functions


³ E. Study, Ber. Sächs. Akad. Wissensch. 1897, p. 442. H. Weyl, Mathematische
Zeitschrift, vol. 20 (1924), p. 136.


[6] 


9 


(B) 


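The first sum runs over all $e!$ permutations $\beta_1 \cdots \beta_e$ of $1, \cdots, e$, the second
over all their $e!/2^{e/2}$ arrangements in "pairs"

    $(\beta_1\beta_2), (\beta_3\beta_4), \cdots.$

By applying the same argument to the subscripts $\alpha$ one concludes that $\Phi$ is a
constant multiple $c$ of $H_e$, (23).
The constant $c$ is determined by the specialization

    $G_\alpha^\beta = (\delta_\alpha^\beta, 0, \cdots, 0)$

for which

    $\Phi = \langle t_1^e \rangle_1 = \dfrac{1 \cdot 3 \cdots (e-1)}{m(m+2) \cdots (m+e-2)}$

and

    $H_e = e!/2^{e/2} (\tfrac12 e)! = (e).$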


INSTITUTE FOR ADVANCED STUDY, 
PRINCETON, N. J. 




NOTE ON POWER SERIES WITH BIG GAPS.*
By M. Kac.¹


This note contains some results concerning the asymptotic distribution of
the values of analytic functions $F(z)$, represented for $|z| < 1$ by the power series
$F(z) = \sum_{k=1}^\infty a_k z^{n_k}$, where $n_k/(n_1 + \cdots + n_{k-1}) \to \infty$ as $k \to \infty$, and $\sum |a_k|^2 = \infty$.
It will be proved that the distribution of values of $F(z)$ in the neighborhood
of the convergence circle is, in a certain sense, the normal distribution. Thus,
the connection between gap theorems and statistics will be made clear once
more.²

In order to simplify the calculations one supposes that the $a_k$ are real
and that $a_k = O(1)$.

From the fact that $a_k = O(1)$ and $\sum a_k^2 = \infty$, it follows that the radius
of convergence is 1.


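1. Let $f_k(t)$ be independent functions ³ on the interval $[0, 1]$ and let

    $\int_0^1 f_k(t)\, dt = 0, \qquad \int_0^1 f_k^2(t)\, dt = h^2, \qquad |f_k(t)| < A.$

Putting $M(r) = \bigl(\sum_{k=1}^\infty a_k^2 r^{2n_k}\bigr)^{1/2}$, one has, for every integer $l \ge 0$,

    $\lim_{r \to 1} M^{-l}(r) \int_0^1 \Bigl(\sum_{k=1}^\infty a_k r^{n_k} f_k(t)\Bigr)^l dt = \dfrac{1}{h\sqrt{2\pi}} \int_{-\infty}^{+\infty} u^l e^{-u^2/2h^2}\, du.$

In order to prove this, notice first that the statistical independence of
$f_k(t)$ and the condition $|f_k(t)| < A$ imply that, for every complex $z$,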


* Presented to the American Mathematical Society February 25, 1939. Received
January 12, 1939.

¹ Fellow of the Parnas Foundation, Lwów, Poland.

² R. E. A. C. Paley and A. Zygmund, "On some series of functions (1) and (2),"
Proceedings of the Cambridge Philosophical Society, vol. 26 (1930), pp. 337-357 and
458-474. See also N. Wiener, "Gap theorems," Comptes Rendus du Congrès
International des Mathématiciens, Oslo 1936.

³ The measurable functions $f_1(t), \cdots, f_k(t)$ defined on $[0, 1]$ are called
independent if for any system of real numbers $\alpha_1, \cdots, \alpha_k$ one has

    $|E\{f_1(t) < \alpha_1, \cdots, f_k(t) < \alpha_k\}| = \prod_{j=1}^k |E\{f_j(t) < \alpha_j\}|,$

where $E\{\ \}$ denotes the set of those $t$ for which the relation inside of $\{\ \}$ is satisfied,
and $|E\{\ \}|$ the measure of this set. For properties of independent functions, see for
instance M. Kac, "Sur les fonctions indépendantes I," Studia Mathematica, vol. 6
(1936), pp. 46-58.



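    $\int_0^1 \exp\Bigl(z M^{-1}(r) \sum_{k=1}^\infty a_k r^{n_k} f_k(t)\Bigr)\, dt = \prod_{k=1}^\infty \int_0^1 \exp\bigl(z M^{-1}(r)\, a_k r^{n_k} f_k(t)\bigr)\, dt.$

Furthermore,

    $\int_0^1 \exp\bigl(z M^{-1}(r)\, a_k r^{n_k} f_k(t)\bigr)\, dt = 1 + \tfrac12 h^2 z^2 M^{-2}(r)\, a_k^2 r^{2n_k} + z^2\, o\bigl(M^{-2}(r)\, a_k^2 r^{2n_k}\bigr),$

where the $o$-term is uniform in $k$, since $|f_k(t)| < A$. Hence it is readily
seen that

    $\int_0^1 \exp\Bigl(z M^{-1}(r) \sum_{k=1}^\infty a_k r^{n_k} f_k(t)\Bigr)\, dt \to \exp(\tfrac12 h^2 z^2) \quad\text{as } r \to 1,$

holds uniformly in every finite circle $|z| \le R$. Now, the coefficient of $z^l$ in
the expansion on the left tends to the corresponding coefficient on the right.
This completes the proof.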

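2. Next, it will be proved that the relation, just proved, also holds if
one replaces $f_k(t)$ by $\cos 2\pi n_k t$, even though the cosines are not independent.

Let $\vartheta_1(t), \vartheta_2(t), \cdots$ be the independent functions of Steinhaus,⁴ which
realize the mapping of $(0, 1)$ on the infinite dimensional torus and conserve
measure. Putting $f_k(t) = \cos 2\pi\vartheta_k(t)$, one obtains a sequence of independent
functions satisfying all conditions of § 1 with $h^2 = 1/2$. Then

(*)    $\lim_{r \to 1} M^{-l}(r) \int_0^1 \Bigl(\sum_{k=1}^\infty a_k r^{n_k} \cos 2\pi\vartheta_k(t)\Bigr)^l dt = \pi^{-1/2} \int_{-\infty}^{+\infty} u^l e^{-u^2}\, du.$

For a fixed $l$, choose a $k(l)$ in such a way that $n_k/(n_1 + \cdots + n_{k-1}) > l$
for every $k > k(l)$. It will first be proved that

(**)    $\int_0^1 \Bigl(\sum_{k=k(l)}^\infty a_k r^{n_k} \cos 2\pi n_k t\Bigr)^l dt = \int_0^1 \Bigl(\sum_{k=k(l)}^\infty a_k r^{n_k} \cos 2\pi\vartheta_k(t)\Bigr)^l dt.$

To this end, it is sufficient to show that

    $\int_0^1 \prod_{s=1}^i \cos^{p_s} 2\pi n_{k_s} t\; dt = \int_0^1 \prod_{s=1}^i \cos^{p_s} 2\pi\vartheta_{k_s}(t)\; dt,$

where $p_1 + p_2 + \cdots + p_i = l$, $k_s > k(l)$. Observing that

    $\beta(p_1, \cdots, p_i) = \int_0^1 \cos^{p_1} 2\pi\vartheta_{k_1}(t) \cdots \cos^{p_i} 2\pi\vartheta_{k_i}(t)\; dt = \prod_{s=1}^i \int_0^1 \cos^{p_s} 2\pi t\; dt,$

one evidently has $\beta(p_1, \cdots, p_i) = 0$, if at least one $p_s$ is odd; while


⁴ H. Steinhaus, "Sur la probabilité de la convergence de séries," Studia
Mathematica, vol. 2 (1930), pp. 21-39.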




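if all $p_s$ are even. Thus, the establishing of the statement (**) depends on
showing that

    $\int_0^1 \prod_{j=1}^i \cos^{p_j} 2\pi n_{k_j} t\; dt = \beta(p_1, \cdots, p_i).$

Using the complex formula for cosines, one obtains

    $\int_0^1 \prod_{j=1}^i \cos^{p_j} 2\pi n_{k_j} t\; dt = 2^{-l} \sum \binom{p_1}{s_1} \cdots \binom{p_i}{s_i} \int_0^1 \exp(2\pi i \Delta t)\, dt,$

where the summation indices $s_j$ run from 0 to $p_j$, and

    $\Delta = (2s_1 - p_1)\, n_{k_1} + \cdots + (2s_i - p_i)\, n_{k_i}.$

Since $\int_0^1 \exp(2\pi i \Delta t)\, dt$ is 0 or 1 according as $\Delta \ne 0$ or $\Delta = 0$, it remains to
be proved that $\Delta = 0$ if and only if $s_1 = p_1/2, \cdots, s_i = p_i/2$. This is obvious,
since $n_k/(n_1 + \cdots + n_{k-1}) > l$ for $k > k(l)$ and $k_s > k(l)$. In fact, suppose
that not all $2s_j - p_j$ vanish and let $m$ be the largest index for which
$2s_m - p_m \ne 0$, then

    $|\Delta| \ge |2s_m - p_m|\, n_{k_m} - \sum_{j=1}^{m-1} |2s_j - p_j|\, n_{k_j},$

    $\sum_{j=1}^{m-1} |2s_j - p_j|\, n_{k_j} \le l \sum_{j=1}^{m-1} n_{k_j} \le l \sum_{j=1}^{k_m - 1} n_j < n_{k_m},$

and, as $|2s_m - p_m|\, n_{k_m} \ge n_{k_m}$, one has $|\Delta| > 0$.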
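
The counting argument above can be replayed exactly for small cases. The following Python sketch is an illustration with hypothetical function names: it evaluates the integral of a product of cosine powers with integer frequencies by enumerating the indices $s_j$ and keeping the terms with $\Delta = 0$, then compares the result with the value $\prod_j 2^{-p_j}\binom{p_j}{p_j/2}$ that independent factors would give. The frequencies $10, 100, 1000$ are an assumed example in which each frequency exceeds $l$ times the sum of the preceding ones for $l = 8$; the frequencies $1, 2, 3$ violate that condition.

```python
from fractions import Fraction
from itertools import product
from math import comb, prod

def cosine_product_integral(freqs, powers):
    """Exact integral over [0, 1] of prod_j cos^{p_j}(2*pi*n_j*t) for integer n_j.

    Uses cos^p x = 2^{-p} * sum_s C(p, s) * exp(i*(2s - p)*x); only index
    choices with Delta = sum_j (2 s_j - p_j) n_j = 0 survive the integration.
    """
    total = 0
    for s in product(*(range(p + 1) for p in powers)):
        delta = sum((2 * sj - pj) * nj for sj, pj, nj in zip(s, powers, freqs))
        if delta == 0:
            total += prod(comb(pj, sj) for pj, sj in zip(powers, s))
    return Fraction(total, 2 ** sum(powers))

def independent_value(powers):
    """Value of the integral if the cosine factors were independent."""
    if any(p % 2 for p in powers):
        return Fraction(0)
    return prod((Fraction(comb(p, p // 2), 2 ** p) for p in powers), start=Fraction(1))

if __name__ == "__main__":
    lacunary = [10, 100, 1000]
    print(cosine_product_integral(lacunary, [2, 4, 2]), independent_value([2, 4, 2]))
    print(cosine_product_integral([1, 2, 3], [1, 1, 1]), independent_value([1, 1, 1]))
```

With the lacunary frequencies both numbers agree, while for $1, 2, 3$ the relation $1 + 2 - 3 = 0$ produces a nonzero integral although the independent value is zero.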


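4. In view of the known inequality ⁵

    $\int_0^1 \Bigl|\sum c_k \cos 2\pi\vartheta_k(t)\Bigr|^p dt < c(p) \Bigl(\sum c_k^2\Bigr)^{p/2},$

one has

    $\int_0^1 \Bigl|\sum_{k=k(l)}^\infty a_k r^{n_k} \cos 2\pi\vartheta_k(t)\Bigr|^p dt < c(p)\, M^p(r).$

An analogous inequality holds ⁶ for

    $\int_0^1 \Bigl|\sum_{k=k(l)}^\infty a_k r^{n_k} \cos 2\pi n_k t\Bigr|^p dt, \qquad \text{if } p < l.$

Observing that, as $r \to 1$,

    $M^{-1}(r) \sum_{k=1}^{k(l)-1} a_k r^{n_k} \cos 2\pi\vartheta_k(t) \quad\text{and}\quad M^{-1}(r) \sum_{k=1}^{k(l)-1} a_k r^{n_k} \cos 2\pi n_k t$

tend to 0 uniformly in $t$, one readily infers from (*) and (**) that

⁵ Loc. cit. ², pp. 467-468.

⁶ This is an immediate consequence of (**) for even $p$. In the case $p$ is odd, one
can apply Hölder's inequality.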




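    $\lim_{r \to 1} M^{-l}(r) \int_0^1 \Bigl(\sum_{k=1}^\infty a_k r^{n_k} \cos 2\pi n_k t\Bigr)^l dt = \pi^{-1/2} \int_{-\infty}^{+\infty} u^l e^{-u^2}\, du.$

This implies, according to a well known theorem,⁷ that

    $\lim_{r \to 1} \Bigl|E\Bigl\{\sum_{k=1}^\infty a_k r^{n_k} \cos 2\pi n_k t < \omega M(r)\Bigr\}\Bigr| = \pi^{-1/2} \int_{-\infty}^{\omega} e^{-u^2}\, du.$

Putting $2\pi t = \phi$ one has

    $\Re(F(re^{i\phi})) = \sum_{k=1}^\infty a_k r^{n_k} \cos n_k\phi = \sum_{k=1}^\infty a_k r^{n_k} \cos 2\pi n_k t$

and one arrives at the following theorem: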


lim | (re'”) ) oM (r) }| 
rl 


A similar reasoning also proves the same theorem for $\Im(F(re^{i\phi}))$.
Using the same method with small modifications, one readily finds the
sharper relation


Lim | (F (re!) < (r),3(F(re")) < (r)}|—4e 
rl 


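5. It is easy to see that if the gap condition is unaltered the above
method permits one to obtain

    $\lim_{m \to \infty} m^{-l/2} \int_0^1 \Bigl(\sum_{k=1}^m \cos 2\pi n_k t\Bigr)^l dt = \pi^{-1/2} \int_{-\infty}^{+\infty} u^l e^{-u^2}\, du,$

and therefore

    $\lim_{m \to \infty} \Bigl|E\Bigl\{\sum_{k=1}^m \cos 2\pi n_k t < \omega m^{1/2}\Bigr\}\Bigr| = \pi^{-1/2} \int_{-\infty}^{\omega} e^{-u^2}\, du.$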
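
For small $l$ and moderate $m$ the moments on the left can be computed exactly and compared with the Gaussian limit. The sketch below is illustrative and uses hypothetical names; the frequencies $n_k = 2^{k^2}$ are an assumed example satisfying the gap condition. It counts the sign patterns with vanishing frequency sum, which is exactly the $\Delta = 0$ bookkeeping of § 2.

```python
from collections import Counter
from fractions import Fraction

def power_moment(freqs, l):
    """Exact integral over [0, 1] of (sum_k cos 2*pi*n_k*t)^l for integer n_k.

    Each cosine is (e^{ix} + e^{-ix})/2, so the integral equals 2^{-l} times
    the number of length-l walks with steps +n_k or -n_k that return to 0.
    """
    sums = Counter({0: 1})
    for _ in range(l):
        step = Counter()
        for value, count in sums.items():
            for n in freqs:
                step[value + n] += count
                step[value - n] += count
        sums = step
    return Fraction(sums[0], 2 ** l)

def gaussian_moment(l):
    """l-th moment of the normal law with density pi^{-1/2} exp(-u^2) (variance 1/2)."""
    if l % 2:
        return Fraction(0)
    value = Fraction(1)
    for j in range(1, l, 2):
        value *= j
    return value / 2 ** (l // 2)

if __name__ == "__main__":
    m = 8
    freqs = [2 ** (k * k) for k in range(1, m + 1)]  # assumed gap sequence
    for l in (2, 3, 4):
        normalized = power_moment(freqs, l) / Fraction(m) ** (l // 2) if l % 2 == 0 else power_moment(freqs, l)
        print(l, normalized, gaussian_moment(l))
```

For $l = 4$ the exact moment is $3m^2/4 - 3m/8$, so the normalized value $3/4 - 3/(8m)$ approaches the Gaussian moment $3/4$ as $m$ grows.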

It is clear that a similar theorem may be obtained for Dirichlet series
with the gap condition considered above.

It seems to be not without interest to investigate similar questions in the
case of the usual gap condition $n_{k+1}/n_k > q > 1$.

Added March 22, 1939. Professor Wintner has pointed out to me that
one can replace the gap considerations, used above, by the corresponding
considerations of Paley and Zygmund (cf. A. Zygmund, Fundamenta Mathematica,
vol. 16 (1930), p. 104 and top of p. 105) and that the result of the present
paper may then be obtained without recourse to independent functions. In
this way the gap condition $n_k/(n_1 + \cdots + n_{k-1}) \to \infty$, used above, becomes
replaced by the condition $n_k/n_{k-1} \to \infty$, which seems to be more general.
Actually it is easy to see that these conditions are equivalent.


THE JOHNS HOPKINS UNIVERSITY.


⁷ Compare for instance A. A. Markoff's book on calculus of probability, and, in the
multidimensional case, E. K. Haviland, American Journal of Mathematics, vol. 56
(1934), pp. 625-658.




ASYMPTOTIC DISTRIBUTIONS AND STATISTICAL 
INDEPENDENCE.* 


By PHILIP HARTMAN, E. R. VAN KAMPEN and AUREL WINTNER.


Introduction. The present paper deals with the distribution problem of
functions which are independent in the sense of Kolmogoroff ¹ or, equivalently,
in that of Steinhaus.² For such functions, Steinhaus has developed, in
coöperation with Kac, a valuable theory ² in case the functions are defined
on a finite interval.³ However, the case which is significant in physical and
number theoretical applications is the case of functions on the infinite range
$-\infty < t < +\infty$; so that the integration processes must be replaced by
averaging processes. In this case, the theory developed ⁴ leads to certain
difficulties, which will be analyzed in § 9 (and § 2 bis) below.

The purpose of the present paper is to remove these difficulties, thus
making the theory of independent functions applicable to the fields
mentioned above. This aim will be reached by establishing the connection
between the theory of independent functions on the one hand and the theory of
convolutions ⁵ on the other hand (cf. the theorem of § 7 below).

In § 9, it will be necessary to refer to the simplest case of a function with
an asymptotic distribution function, viz., the case of a real-valued uniformly
almost periodic function $f(t)$, $-\infty < t < +\infty$. If $\sigma_T(x)$, where $T > 0$
and $-\infty < x < +\infty$, denotes the ratio of the sum of the lengths of those
subintervals of the interval $-T \le t \le T$ on which $f(t) < x$ and of the
length $2T$ of this interval, then $f(t)$ has ⁶ an asymptotic distribution function,
in the sense that there exists a monotone function $\sigma(x)$, $-\infty < x < +\infty$,
such that $\sigma_T(x)$ tends, as $T \to \infty$, to $\sigma(x)$ with the possible exception of those

* Received February 11, 1939.

¹ A. Kolmogoroff, Grundbegriffe der Wahrscheinlichkeitsrechnung, Berlin (1933),
p. 50.

² Cf. H. Steinhaus, Actualités Scientifiques et Industrielles (1938), where further
references are given.

³ M. Kac, Studia Mathematica, vol. 6 (1936), pp. 46-58.

⁴ M. Kac and H. Steinhaus, Studia Mathematica, vol. 7 (1937), pp. 1-15.

⁵ Cf., e.g., B. Jessen and A. Wintner, Transactions of the American Mathematical
Society, vol. 38 (1935), pp. 48-88 (more particularly § 2-§ 3), where further references
are given.

⁶ A. Wintner, Zeitschrift für Physik, vol. 48 (1928), pp. 148-161; Mathematische
Zeitschrift, vol. 30 (1929), pp. 290-319.



points $x$ which are discontinuity points of $\sigma(x)$. (In particular, the set of
those $x$ for which $\lim \sigma_T(x)$ does not exist is at most enumerable). It is clear
from the example of a periodic, continuous, Cantor function that $\lim \sigma_T(x)$
may exist for every $x$ even if the discontinuity points of this limit function
$\sigma(x)$ are dense between $x = \operatorname{Min} f(t)$ and $x = \operatorname{Max} f(t)$. On the other hand,
Bohr ⁷ has constructed an almost periodic function, which actually is
limit-periodic (grenzperiodisch), with the property that the limit of $\sigma_T(x)$ does not
exist at an exceptional $x$. (It is an unsolved problem whether or not such
exceptional $x$ may form, for a suitable almost periodic $f(t)$, a set which is
dense on an interval). Obviously, one should like to consider the functions
$f(t)$ and $f(rt)$ as statistically independent if $r$ is an irrational number and
$f(t)$ a limit-periodic function.

The theory to be developed takes care of the difficulties implied by the
above situation.


1. Let $R_x$ denote a real $k$-dimensional vector space of the points
$x = (x^1, \cdots, x^k)$. By $x < y$ will be meant that $x^j < y^j$ for $j = 1, \cdots, k$.
Correspondingly, $Q(a, b)$ will denote the $k$-dimensional interval $a^j < x^j < b^j$,
$j = 1, \cdots, k$. A sequence $\{Q_n\}$ of intervals $Q_n = Q(a_n, b_n)$ will be said to
be dense on $R_x$ if there exists for every interval $Q(\alpha, \beta) \subset R_x$ and for every
$\epsilon > 0$ an $n = n(\alpha, \beta)$ such that $|\alpha - a_n| < \epsilon$ and $|\beta - b_n| < \epsilon$, where $|c|$
denotes the length of the vector $c$.

By a distribution function on $R_x$ will be meant an additive monotone set
function $\phi = \phi(Q)$, defined on a dense set of intervals $Q$ in such a way that
$\phi(R_x) = 1$. It is well known ⁸ that $\phi$ can uniquely be extended to all intervals
$Q$ which are "continuity intervals" of $\phi$; and that "nearly all" intervals $Q$
are continuity intervals of $\phi$, in the sense that there exists for every $\phi$ a
sequence of vectors, $\{y_n\}$, in such a way that the interval $Q(a, b)$ is a
continuity interval of $\phi$ whenever $a^j \ne y_n^j \ne b^j$ for $j = 1, \cdots, k$ and
$n = 1, 2, \cdots$. Correspondingly, a family $\phi_T(Q)$, $0 < T < \infty$, of distribution
functions will be said to tend to a limit distribution function $\phi$, if $\phi_T(Q)
\to \phi(Q)$, where $T \to \infty$, holds at every continuity interval $Q$ of $\phi$.

In view of the continuity theorem of Fourier-Stieltjes transforms,⁹ the
limit distribution function $\phi$ exists if and only if the Fourier-Stieltjes
transform $L(u; \phi_T)$ tends, as $T \to \infty$, to a limit function uniformly in every fixed
sphere $|u| <$ const., in which case $\lim L(u; \phi_T) = L(u; \phi)$. It is understood
that $L(u; \psi)$ is defined, for every distribution function $\psi$ and for every
point $u = (u^1, \cdots, u^k)$ of a real $k$-dimensional vector space $R_u$, by the
$k$-dimensional Stieltjes integral

(1)    $L(u; \psi) = \int_{R_x} \exp(iu \cdot x)\, \psi(dR_x),$

where the dot denotes scalar multiplication. The value of $\psi$ for a continuity
interval $Q$ of $\psi$ is represented by the inversion formula ¹⁰ for the transform (1).


⁷ H. Bohr, Danske Videnskabernes Selskab, Mathematiske-Fysiske Meddelelser, vol.
10, Nr. 6 (1930), pp. 12-17.

⁸ Cf., e.g., A. Kolmogoroff, loc. cit. ¹, Chap. II.

⁹ Cf. E. K. Haviland, American Journal of Mathematics, vol. 57 (1935), pp. 382-388.


2. For a given measurable set $E$ on a $t$-axis and for any $T > 0$, let $\rho_T E$
denote the ratio of the Lebesgue measure of the common part of $E$ and the
interval $-T \le t \le T$ and of the length $2T$ of this interval; so that
$0 \le \rho_T E \le 1$. If $\rho_T E$ tends to a limit as $T \to \infty$, the measurable set $E$ will
be called relatively measurable,⁴ and $\lim \rho_T E$ will be denoted by $\rho E$. Trivial
examples show that the relative measure $\rho$, though additive, is not completely
additive, and that the common part of two relatively measurable sets need
not be relatively measurable.

Let $x(t)$, $-\infty < t < +\infty$, be a given measurable vector function whose
components $x^j(t)$ are coordinates of a point $x \subset R_x$. For any $Q = Q(a, b)$,
let $[x(t) \subset Q]_T$, where $T > 0$, denote the set of those points $t$ of the interval
$-T \le t \le T$ at which $a < x(t) < b$. Since this set is measurable, one can
define a family $\phi_T(Q)$, $0 < T < +\infty$, of distribution functions by placing

(2)    $\phi_T(Q) = \rho_T [x(t) \subset Q]_T.$

The notation (1), when applied to (2), is equivalent to

(3)    $L(u; \phi_T) = M_T\{\exp iu \cdot x(t)\},$

where $M_T\{g(t)\} = (1/2T) \int_{-T}^{T} g(t)\, dt$.

Now, if there exists a distribution function $\phi$ such that $\phi_T \to \phi$ as $T \to \infty$,
then $x(t)$ is said to have an asymptotic distribution function, $\phi$. In view of
(3) and the continuity theorem of Fourier-Stieltjes transforms, this will be
the case ¹¹ if and only if the limit $M\{\exp iu \cdot x(t)\}$, where $M = \lim M_T$, exists
uniformly in every fixed sphere $|u| <$ const., in which case

(4)    $L(u; \phi) = M\{\exp iu \cdot x(t)\}.$

And $\phi$ may be obtained from (4) by an application of the inversion formula
of the transform (1).
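
A concrete case may make (3) and (4) more tangible. The following Python sketch is an illustration under stated assumptions, not part of the paper: it takes $k = 1$ and $x(t) = \cos t$, approximates the mean $M_T$ by a midpoint rule, and compares the result with the Bessel function $J_0(u)$, which is the known value of the mean of $\exp(iu \cos t)$ over a period. It also estimates $\phi_T(Q)$ for the half-line $x < \tfrac12$, whose limit is $2/3$. All function names are hypothetical.

```python
import cmath
import math

def time_average(g, T, steps=200_000):
    """Midpoint-rule approximation of M_T{g} = (1/2T) * integral of g over [-T, T]."""
    h = 2 * T / steps
    return sum(g(-T + (i + 0.5) * h) for i in range(steps)) / steps

def bessel_j0(u, terms=40):
    """Power series of the Bessel function J_0, adequate for moderate |u|."""
    return sum((-1) ** k * (u / 2) ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))

if __name__ == "__main__":
    T = 50 * math.pi  # a whole number of periods of cos t
    u = 1.5
    transform = time_average(lambda t: cmath.exp(1j * u * math.cos(t)), T)
    print(transform, bessel_j0(u))
    # relative measure of the set where cos t < 1/2
    print(time_average(lambda t: 1.0 if math.cos(t) < 0.5 else 0.0, T))
```

Choosing $T$ as a multiple of the period makes $M_T$ coincide with its limit $M$; for a general almost periodic $x(t)$ one would have to let $T$ grow.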


¹⁰ Cf. E. K. Haviland, ibid., pp. 94-100.
¹¹ Cf. loc. cit. ⁵, pp. 74-75.



2 bis. Notice that (2) may tend, as $T \to \infty$, to a monotone additive set
function also when $x(t)$ does not have an asymptotic distribution function.
In fact, examples of the type $x(t) = t$ (or $x(t) = t \sin t$) show that the
limit ($= 0$) of (2) need not be a distribution function. (And the continuity
theorem of the Fourier-Stieltjes transforms has no analogue in this case, as
shown by trivial examples).

Thus, on using the notion of relative measure, and denoting by
$[x(t) \subset Q]$, where $Q = Q(a, b)$, the set of those points of the $t$-axis at which
$a < x(t) < b$, one can say that the relative measurability of the set $[x(t) \subset Q]$
for a dense set of intervals $Q$ of $R_x$ is necessary but not sufficient for the
existence of an asymptotic distribution function $\phi$ of $x(t)$. In fact, one must
also require that the total variation of

(5)    $\phi(Q) = \lim_{T \to \infty} \rho_T [x(t) \subset Q]_T$

over $R_x$ (i.e., the total probability) should be 1.


3. For a given interval $Q = Q(a, b)$ of the $k$-dimensional space $R_x$, let
$Q^j$, where $j = 1, \cdots, k$, denote the interval $a^j < x^j < b^j$ on the $x^j$-axis,
so that $Q = Q^1 \times \cdots \times Q^k$. Let $x(t)$ be a measurable vector function which
has an asymptotic distribution function. Then each of the $k$ components
$x^j(t)$ of $x(t)$ has an asymptotic distribution function

(6)    $\phi^j(Q^j) = \lim_{T \to \infty} \rho_T [x^j(t) \subset Q^j]_T$

(where it is understood that the limits (5) and (6) only need to exist on
"nearly all" intervals $Q$, $Q^j$ of $R_x$, $R_{x^j}$ respectively). This is clear from the
$M$-criterion of § 2, since the uniform existence of the limit $M\{\exp iu \cdot x(t)\}$
on every fixed sphere $|u| <$ const., where $u = (u^1, \cdots, u^k)$, implies the
uniform existence of the limit $M\{\exp iu^j x^j(t)\}$ on every fixed interval
$|u^j| <$ const. (In fact, put in the scalar product $u \cdot x(t)$ all but the $j$-th
component of $u$ equal to 0).

On the other hand, it is quite possible that each of the $k$ components $x^j(t)$
of $x(t)$ has, while $x(t)$ itself does not have, an asymptotic distribution
function. In order to see this, it is sufficient to choose on the $t$-axis $k$ sets
$S^1, S^2, \cdots, S^k$ in such a way that while each of them is, their common part
is not, relatively measurable, and then define $x^j(t)$ to be the characteristic
function of $S^j$, where $j = 1, \cdots, k$.

4. Let $k$ real, measurable, scalar functions $x^j(t)$, $-\infty < t < +\infty$, be
called statistically independent, if, on the one hand, the vector function
$x(t) = (x^1(t), \cdots, x^k(t))$ has an asymptotic distribution function (5), and,
on the other hand, (5) may be expressed in terms of the $k$ asymptotic
distribution functions (6) as follows:

(7)    $\phi(Q) = \phi(Q^1 \times \cdots \times Q^k) = \phi^1(Q^1) \cdots \phi^k(Q^k).$

It is clear that if the $k$ functions $x^j(t)$ are statistically independent, then
so are any $k - 1$ functions in this set of $k$ functions. On the other hand,
if each of the $k$ sets consisting of $k - 1$ of $k$ functions $x^j(t)$ represent
statistically independent functions, then the $k$ functions $x^j(t)$ need not be
statistically independent. This is shown ¹² by the example of the $k = 3$
binary Walsh products

    $x^1(t) = r(t)\, r(2t), \qquad x^2(t) = r(2t)\, r(4t), \qquad x^3(t) = r(4t)\, r(t)$

of Rademacher functions $r(2^n t)$, where $r(t) = \operatorname{sgn} \sin 2\pi t$; and,
correspondingly, by the continuous example of the $k = 3$ periodic functions
z(t) = cos ft, x? (t) = cos at, z*(t) =cos(1+ 
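
The Walsh-product example can be checked by direct enumeration. The three products have period 1 and are constant on each of the eight intervals of length $1/8$, so relative measure reduces to counting intervals. The Python sketch below is an illustration with hypothetical names; it evaluates the functions at the midpoints of those intervals, using $r(t) = +1$ on $(0, \tfrac12)$ and $-1$ on $(\tfrac12, 1)$, which agrees with $\operatorname{sgn} \sin 2\pi t$ away from the jump points.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def rademacher(t):
    """r(t) = sgn sin(2*pi*t) for t not a multiple of 1/2 (period 1)."""
    return 1 if (t % 1) < Fraction(1, 2) else -1

def walsh_products(t):
    """The three functions x^1, x^2, x^3 of the example."""
    return (
        rademacher(t) * rademacher(2 * t),
        rademacher(2 * t) * rademacher(4 * t),
        rademacher(4 * t) * rademacher(t),
    )

def joint_distribution():
    """Joint distribution of (x^1, x^2, x^3): each interval of length 1/8 has weight 1/8."""
    joint = Counter()
    for j in range(8):
        midpoint = Fraction(2 * j + 1, 16)
        joint[walsh_products(midpoint)] += Fraction(1, 8)
    return joint

def pair_distribution(joint, i, j):
    """Marginal joint distribution of the components i and j."""
    pair = Counter()
    for values, weight in joint.items():
        pair[(values[i], values[j])] += weight
    return pair

if __name__ == "__main__":
    joint = joint_distribution()
    for i, j in [(0, 1), (1, 2), (0, 2)]:
        print((i, j), dict(pair_distribution(joint, i, j)))
    print(dict(joint))
```

Every pair of components takes each of the four sign combinations with weight $1/4$, as independence of pairs requires, whereas only four of the eight sign triples occur, because the product $x^1 x^2 x^3$ is identically 1.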


5. Let $k$ real, measurable, scalar functions on $-\infty < t < +\infty$ be
called statistically independent in the additive sense, if, on the one hand, each
of them has an asymptotic distribution function, and, on the other hand,
the sum of them has an asymptotic distribution function which is represented
by the convolution of the $k$ one-dimensional distribution functions.

In this definition, the convolution $\psi = \psi^1 * \cdots * \psi^k$ of $k$ one-dimensional
distribution functions is meant in the usual sense ⁵ and denotes, therefore, the
distribution function which may be defined by

(8)    $L(\lambda; \psi) = \prod_{j=1}^k L(\lambda; \psi^j), \quad\text{where } -\infty < \lambda < +\infty;\ \text{cf. (1)}.$
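
The defining relation (8) is easy to observe for distributions with finitely many atoms. The following Python sketch is only an illustration: the representation of a distribution as a dictionary from points to masses, and the function names, are assumptions made for the example.

```python
import cmath
from collections import defaultdict

def convolve(dist_a, dist_b):
    """Convolution of two discrete distributions given as {point: mass}."""
    out = defaultdict(float)
    for xa, pa in dist_a.items():
        for xb, pb in dist_b.items():
            out[xa + xb] += pa * pb
    return dict(out)

def transform(dist, lam):
    """Fourier-Stieltjes transform L(lam; psi) = sum of mass * exp(i*lam*x)."""
    return sum(p * cmath.exp(1j * lam * x) for x, p in dist.items())

if __name__ == "__main__":
    psi1 = {0: 0.5, 1: 0.5}
    psi2 = {-1: 0.25, 0: 0.5, 2: 0.25}
    psi = convolve(psi1, psi2)
    for lam in (-2.0, 0.3, 1.7):
        print(lam, transform(psi, lam), transform(psi1, lam) * transform(psi2, lam))
```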


In § 7, it will be convenient to think of the convolution of $k$ one-dimensional
distribution functions in a geometrical fashion, as follows:

If $Q^j$ is an interval on the line $R_{x^j}$ and $\psi^j = \psi^j(Q^j)$ a distribution
function on $R_{x^j}$, where $j = 1, \cdots, k$, let $\sigma^j = \sigma^j(x^j)$, where $-\infty < x^j < +\infty$,
denote the point function which represents the value of the set function $\psi^j$
for the half-line whose upper end is the point $x^j$. For a given real number
$u^j$ let $\sigma_{u^j}^j = \sigma_{u^j}^j(x^j)$ denote the point function on $R_{x^j}$ which is equal to
$\sigma^j(x^j/u^j)$, $1 - \sigma^j(x^j/u^j)$ or $\tfrac12(1 + \operatorname{sgn} x^j)$ according as $u^j > 0$, $u^j < 0$ or
$u^j = 0$. Then the one-dimensional distribution function, say $p$, which is the
convolution of the $k$ distribution functions $\psi_{u^j}^j$ associated with the $k$ point
functions $\sigma_{u^j}^j$ may be obtained as follows: Consider on $R_x = R_{x^1} \times \cdots \times R_{x^k}$
the distribution function $\psi(Q)$ which is defined by

    $\psi(Q) = \psi^1(Q^1) \cdots \psi^k(Q^k)$

and project this distribution in $R_x$ orthogonally on that line, $\Gamma_u$, through the
origin of $R_x$ whose direction cosines are proportional to the components $u^j$ of
$u$, where $u \ne (0, \cdots, 0)$. Then the resulting one-dimensional distribution
on $\Gamma_u$ is precisely $p$.

This may easily be verified either directly ¹³ or by using Fourier-Stieltjes
transforms.¹⁴


¹² Cf. S. Bernstein's example, mentioned loc. cit. ⁴, p. 10.


6. The notions of "statistical independence" (§ 4) and "statistical
independence in the additive sense" (§ 5) are not equivalent. In fact, let ¹⁵
a(t)=0, 4, 4 for OSt<}4, 

and 

«*(t) =— 0, —4 for b= b< %, % = %, % = t 1% 
respectively, while 

x(t) =2'(t+1) and =2°(t+1) 

It is easy to see that these $k = 2$ periodic step functions $x^1(t)$, $x^2(t)$ are
statistically independent in the additive sense, without being statistically
independent. It may be mentioned that for this $x(t) = (x^1(t), x^2(t))$ the
functions $u^1 x^1(t)$, $u^2 x^2(t)$, where $u^1$, $u^2$ denote arbitrary distinct non-vanishing
constants, are not independent in either sense.


¹³ An early instance of this interpretation of the convolution process is implied by
Sommerfeld's approach to the Gaussian law in the particular case of the
"Abrundungsfehler" (Boltzmann Festschrift, Leipzig (1904), pp. 848-859 or Bulletin of the Calcutta
Mathematical Society, vol. 20 (1930); cf. also V. Brun, Norsk Matematisk Tidsskrift,
vol. 14 (1932), pp. 81-92 and F. Tricomi, Jahresbericht der Deutschen Mathematiker
Vereinigung, vol. 42 (1933), pp. 174-179).

¹⁴ Cf., e.g., H. Cramér and H. Wold, Journal of the London Mathematical Society,
vol. 11 (1936), pp. 290-294.

¹⁵ Another example, based on $k = 3$ infinite Rademacher sums, was communicated to
us by Dr. M. Kac; they correspond to the $k = 3$ Besicovitch almost periodic functions


wi(t) ~ cos ¢ ; (> (4,/)?< +), 
n=1 n=1 
in which {r,} is any sequence of real numbers which are linearly independent in the 
="), and the three sequences {a,j} are defined as follows: 


rational field (e.g., A,, 


Thi 2= 1 2= 1 aceordi = 3—q 1 a2 
while a, a, or 4,, a, according as » > 1 or n 1, finally a, a,1 +4, 


for every n. 




7. We shall, however, prove the following theorem:

$k$ real, measurable, scalar functions $x^j(t)$, $-\infty < t < +\infty$, are statistically
independent if and only if the $k$ functions $u^j x^j(t)$, where $u^1, \cdots, u^k$
denote $k$ arbitrary real constants, are statistically independent in the additive
sense.

In order to prove this theorem, put $x(t) = (x^1(t), \cdots, x^k(t))$ and
$u = (u^1, \cdots, u^k)$. Since a distribution function and its Fourier-Stieltjes
transform determine each other uniquely, one sees from (1) that (7) is
equivalent to

(9)    $L(u; \phi) = \int_{R_x} \exp(iu \cdot x)\, \phi(dR_x) = \prod_{j=1}^k \int_{R_{x^j}} \exp(iu^j x^j)\, \phi^j(dR_{x^j}) = \prod_{j=1}^k L(u^j; \phi^j).$

Since this is an identity in $u$, one can also say that (7) is equivalent to

(9 bis)    $L(\lambda u; \phi) = \prod_{j=1}^k L(\lambda u^j; \phi^j),$

where $u$ is any vector and $\lambda$ any scalar. Hence, on considering $u = (u^1, \cdots, u^k)$
as an arbitrary fixed vector and $\lambda$ as the variable on a one-dimensional space
$R_\lambda$, one sees from (4), from the $M$-criterion for the existence of an asymptotic
distribution function (§ 2), and from the criterion (8) for a one-dimensional
convolution $\psi = \psi^1 * \psi^2 * \cdots * \psi^k$, that

(i) if it is assumed that the $k$ scalar functions $x^j(t)$ are statistically
independent, then the scalar function $u \cdot x(t)$ of $t$, where $u =$ const., has an
asymptotic distribution function, which is the convolution of the asymptotic
distribution functions of the $k$ scalar functions $u^j x^j(t)$ of $t$, where every
$u^j =$ const.; and that

(ii) if it is assumed that the $k$ scalar functions $u^j x^j(t)$ of $t$, where every
$u^j =$ const., are statistically independent in the additive sense for arbitrary $u^j$,
then the vector $x(t)$ and each of its components $x^j(t)$ have asymptotic
distribution functions, $\phi$ and $\phi^j$, which satisfy (9 bis), i.e. (9), hence also (7).

Since the assertion of (i) is precisely the assumption of (ii) and vice
versa, the proof is complete.

8. On comparing (9) and (i)-(ii) with (4) and the M-criterion for the 
existence of an asymptotic distribution function (§ 2), one readily sees that 
the theorem proved in § 7 may be restated as follows: 


$k$ real, measurable, scalar functions $x^j(t)$ are statistically independent if
and only if the limit $M\{\exp iu \cdot x(t)\}$ of $M_T\{\exp iu \cdot x(t)\}$ exists uniformly
in every fixed sphere $|u| <$ const. and is such that

(10)    $M\{\exp iu \cdot x(t)\} = \prod_{j=1}^k M\{\exp iu^j x^j(t)\}$

is an identity in $u$, where $x(t) = (x^1(t), \cdots, x^k(t))$ and $u = (u^1, \cdots, u^k)$.
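
Criterion (10) can be probed numerically for simple almost periodic functions. The sketch below is illustrative and makes several assumptions not found in the paper: the pair $\cos t$, $\cos \sqrt2\, t$ is taken as a plausible independent pair, the pair $\cos t$, $\cos t$ as an obviously dependent one, and the limit $M$ is replaced by the finite mean $M_T$ with $T = 500$, so that agreement can only be expected up to an error of order $1/T$. The function name is hypothetical.

```python
import cmath
import math

def mean_exp(u, x, T=500.0, steps=100_000):
    """Midpoint-rule approximation of M_T{exp(i * sum_j u_j * x_j(t))} on [-T, T]."""
    h = 2 * T / steps
    total = 0j
    for i in range(steps):
        t = -T + (i + 0.5) * h
        total += cmath.exp(1j * sum(uj * xj(t) for uj, xj in zip(u, x)))
    return total / steps

if __name__ == "__main__":
    root2 = math.sqrt(2.0)
    u = (1.0, 0.7)
    pairs = {
        "cos t, cos sqrt(2) t": [math.cos, lambda t: math.cos(root2 * t)],
        "cos t, cos t": [math.cos, math.cos],
    }
    for name, x in pairs.items():
        joint = mean_exp(u, x)
        factored = mean_exp((u[0],), [x[0]]) * mean_exp((u[1],), [x[1]])
        print(name, abs(joint - factored))
```

For the first pair the two sides of (10) agree to within the finite-$T$ error, while for the second pair they differ markedly, in keeping with the fact that a function is not independent of itself.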

This criterion for the statistical independence of $k$ functions $x^1(t), \cdots, x^k(t)$
is due to Kac ³ in the particular case where the $k$ functions $x^j(t)$ have the
common period 1, i.e. for the case where the relative measure $\rho = \lim \rho_T$
on the infinite line $-\infty < t < +\infty$ reduces to the ordinary Lebesgue
measure on the interval $0 \le t \le 1$. Since the latter measure is, while the
relative measure is not, completely additive (and such as to preserve
measurability when passing from two measurable sets to their common part), the
approach used loc. cit.³ in the case of periodic functions cannot be modified
in such a way as to apply to the general case of relative measure. Methodically,
the situation is that the proof given loc. cit.³ in the particular case
$M\{g(t)\} = \int_0^1 g(t)\, dt$ merely depends on the uniqueness theorem of Fourier-Stieltjes
transforms; the Dirichlet discontinuity factors ³ merely serve the purpose of
proving this uniqueness theorem.¹⁶ On the other hand, the above treatment
of the general case depends very much on the continuity theorem ⁹ of
Fourier-Stieltjes transforms; a theorem essentially deeper than the uniqueness theorem
(which is, of course, implied by it, and also by the inversion formula ¹⁰).


9. For the case of relative measure, Kac and Steinhaus* have proposed
a definition of the statistical independence of the $k$ components $x^j(t)$ of a
measurable vector function $x(t)$. This definition differs from ours for two
reasons. In fact, loc. cit.

(I) it is not required (as it is in § 4) that the limit (5) of (2) be a
distribution function (cf. § 2 bis);

(II) it is required (as it is not in § 4) that the limit (5) of (2) should
exist for every (and not “nearly all”) interval $Q$ (cf. the example of Bohr,’
mentioned in the Introduction).

On admitting (II), one is compelled* to impose smoothness conditions
on the asymptotic distribution functions, even in case the functions $x^j(t)$ are
bounded (in which case (I), of course, may be disregarded). Such smooth- 


Cf. e.g., E. K. Haviland, American Journal of Mathematics, vol. 56 (1934); 
pp. 625-658 (more particularly, pp. 638-641). 




ASYMPTOTIC DISTRIBUTIONS AND STATISTICAL INDEPENDENCE. 485 


ness conditions, when satisfied, are often hard to establish and are not, in 
general, satisfied even for continuous periodic functions. 


9 bis. No such additional smoothness condition is needed if, on
disregarding (I) and (II), one defines statistical independence as in § 4. In
fact, it follows from the M-criterion of the existence of an asymptotic
distribution function (§ 2) and from (4), that $k$ bounded measurable functions
$x^j(t)$ are statistically independent if and only if any monomial of the $x^j(t)$
has an M-average and

(11) $M\{\prod_{j=1}^{k} (x^j(t))^{n_j}\} = \prod_{j=1}^{k} M\{(x^j(t))^{n_j}\}$

holds for $k$ arbitrary non-negative integral exponents $n_j$. This becomes clear
by observing 1? that, in case of bounded functions $x^j(t)$, the monomial averages
(11), which are then identical with the respective momenta


(11 bis) II -I (dR), 

j=l 
simply are, up to the factors $i^{n_1 + \cdots + n_k} (n_1!)^{-1} \cdots (n_k!)^{-1}$, the coefficients of
$(u^1)^{n_1} \cdots (u^k)^{n_k}$ in the (convergent) expansion of (4) according to powers
of the components $u^j$ of $u$. Actually, the same holds also when the functions
$x^1(t), \cdots, x^k(t)$, instead of being bounded, are such as to possess asymptotic
distribution functions which belong to determined momentum problems.*®


As to “very unbounded ” cases, cases in which the momenta need not 
exist, cf. loc. cit. °, p. 76. 


10. Let it finally be mentioned that the pathology of the relative measure
$\mu = \lim \mu_T$ in the case of statistically independent functions appears to be
essentially milder than in the general case.

First, if the $k$ components $x^j(t)$ of $x(t)$ are statistically independent,
then, by (5), (6) and (7), one has (for “nearly all” $Q \subset R_k$)

(7 bis) $\mu[x(t) \subset Q] = \prod_{j=1}^{k} \mu[x^j(t) \subset Q^j]$,

and vice versa. Since $[x(t) \subset Q]$ is the set of those points $t$ which are
common to the $k$ sets $[x^j(t) \subset Q^j]$, it follows that the second of the two


“Cf. E. K. Haviland, Proceedings of the National Academy of Sciences, vol. 19 
(1933), pp. 549-555. 

* Cf. A. Wintner, Mathematische Zeitschrift, vol. 36 (1933), pp. 618-629; E. K.
Haviland, loc. cit. 1*; H. Cramér and H. Wold, loc. cit. 3°.




pathological properties of the relative measure $\mu$ (cf. § 2) does not concern the
present case.

The first of these properties, that concerning the failure of complete
additivity, involves a more interesting and difficult question. In order to see
this, let a sequence of measurable functions $\{x_n(t)\}$ be called convergent in
relative measure on $-\infty < t < +\infty$ if, on placing $\mu = \lim_{T \to \infty} \mu_T$, one has

($12_1$) $\mu[|x_n(t) - x_m(t)| > \epsilon] \to 0$, $n, m \to \infty$, for every fixed $\epsilon > 0$;

and let $\{x_n(t)\}$ be said to tend to a limit function in relative measure if there
exists a measurable function $x(t)$, $-\infty < t < +\infty$, such that

($12_2$) $\mu[|x_n(t) - x(t)| > \epsilon] \to 0$, $n \to \infty$, for every fixed $\epsilon > 0$.

It is clear that if there exists such a limit function $x(t)$, it is uniquely
determined by $\{x_n(t)\}$ save on a $t$-set of relative measure zero (in fact,
$\mu[|x(t) - y(t)| > \epsilon] = 0$ cannot hold for every $\epsilon > 0$, unless $x(t) = y(t)$
almost everywhere in relative measure). It is also clear that ($12_2$) implies
($12_1$); on the other hand, trivial examples show that ($12_1$) does not imply
(as it does in the case of Lebesgue measure on a finite interval) the existence
of an $x(t)$ satisfying ($12_2$).

Now, it seems to be a reasonable guess that if the sequence $\{x_n(t)\}$ is of
the structure $x_n(t) = \xi_1(t) + \cdots + \xi_n(t)$, where $\{\xi_n(t)\}$ is an infinite
sequence of functions such that the $n$ functions $\xi_1(t), \cdots, \xi_n(t)$ are
statistically independent for every $n$, then ($12_1$) implies the existence of an $x(t)$
satisfying ($12_2$). The truth of this conjecture follows in some cases from the
theory of infinite convolutions *°; while in another case, namely in the case of
additive number-theoretical functions, the truth of the conjecture is suggested
by a comparison of the sufficient criterion of Erdős *° with the necessary and
sufficient condition *4 for the convergence of an infinite convolution. But we
succeeded in constructing an example which shows that the theorem is wrong
for arbitrary $\{\xi_n(t)\}$.


QUEENS COLLEGE, 
THE JOHNS HOPKINS UNIVERSITY. 


Cf. B. Jessen and A. Wintner, loc. cit. °, Theorem 24 and §16. The problem 
raised above is (roughly) to the effect, whether or not the converse of this Theorem 24 
is true in case of finite sets which may be selected from the sequence {@ (t) —~a, (t)} 
consisting of statistically independent functions. 

°° P. Erdős, Journal of the London Mathematical Society, vol. 13 (1938), pp.
119-127.

*1 Cf. B. Jessen and A. Wintner, loc. cit. 5, Theorem 34. 




THE QUATERNION CONGRUENCE $\bar{t}at \equiv b \pmod{g}$.*


By R. E. O’Connor and G. PALL. 


1. Introduction. In this article we shall study the congruence

(1) $\bar{t}at \equiv b \pmod{g}$,

where $a$ and $b$ are given pure quaternions, and $t$ is a variable quaternion, but
$g$ is a rational integer. Special cases have been treated by Pall, who considered
mainly those properties needed for certain applications to the equation


(1’) h(8n +1) + 2,” + 2,7. 


We shall now determine for (1) necessary and sufficient conditions for
solvability, in nearly all cases, and shall find the number and nature of the solutions.
Interesting applications will be made, concerning the forms $x_1^2 + x_2^2 + x_3^2$ and

(2) $a \circ b = a_1b_1 + a_2b_2 + a_3b_3$.


Notations. The letters $a, b, c, t, u, \cdots, z$ denote (Lipschitz) quaternions
with rational integer coördinates (for an odd modulus $g$ they may equally well
be taken as Hurwitz quaternions); for example, $t = t_0 + i_1t_1 + i_2t_2 + i_3t_3$,
the integer coördinates being distinguished by subscripts; $a, b, c$ are pure, e.g.
$a = i_1a_1 + i_2a_2 + i_3a_3$; $\bar{t} = t_0 - i_1t_1 - i_2t_2 - i_3t_3$; $Nt = t\bar{t} = t_0^2 + t_1^2 + t_2^2 + t_3^2$.
We set $h = Na = a_1^2 + a_2^2 + a_3^2$. Except for the quaternion
units $i_1, i_2, i_3$, letters with subscripts, as well as $f, g, h, \cdots, s$, denote rational
integers; $p$ an odd prime. It should be observed that for any $t$, $\bar{t}at$ is pure
along with $a$.

As usual the question of solving (1) reduces to the case $g = p^n$ or $2^n$
(§ 14). On taking norms of both members of

($3_n$) $\bar{t}at \equiv b \pmod{p^n}$

we obtain

($4_n$) $(Nt)^2 Na \equiv Nb \pmod{p^n}$.

Hence the solvability of ($3_n$) requires the solvability for $s$ of


* Received by the Editors, June 15, 1938. 
*G. Pall, American Journal of Mathematics, vol. 59 (1937), pp. 895-913. Referred 
to as AJ. 

($5_n$) $s^2 Na \equiv Nb \pmod{p^n}$

and the possible values of $Nt$ are subject to

($6_n$) $(Nt)^2 Na \equiv s^2 Na \pmod{p^n}$, $s$ a solution of ($5_n$).

We shall find the striking result that:

(7) if $p \nmid b$, there are equally many solutions $t$ of ($3_n$) with $Nt$ in any residue
class (mod $p^n$) consistent with ($6_n$).

But this does not hold if $p \mid b$: we shall treat this case only if $n = 1$, and then
if $p \nmid a$, all solutions $t$ will be found to satisfy $Nt \equiv 0 \pmod{p}$, even when
$p \mid Na$. Our solution for $p \nmid b$, $n = 1$, will be complete.

The conditions obtained above on taking norms are sufficient for
solvability (mod $p^n$) only if $p \nmid Na$. Let $p \mid Na$. If $p \nmid b$ the solvability of ($3_n$)
will follow from that of ($3_1$) and the norm condition ($5_n$). In § 9, we prove


THEOREM 1. Let $a$ and $b$ be pure, $p$ an odd prime dividing $Na$ and $Nb$
but neither $a$ nor $b$. Then ($3_1$) is solvable for $t$ if and only if $[a, b]_p = 1$,
where

(8) $[a, b]_p = (2a \circ b \mid p)$ if $p \nmid a \circ b$,
    $[a, b]_p = (m \mid p)$ if $p \mid a \circ b$,

$m$ being in the latter case an integer determined by $b \equiv ma \pmod{p}$.
The possibility of this is due to the interesting result, proved in § 8:

THEOREM 2. Two vectors $(a_1, a_2, a_3)$, $(b_1, b_2, b_3)$ of norms divisible by $p$,
are linearly dependent (mod $p$) if and only if $p \mid a \circ b$.


Thus the symbol $[a, b]_p$ is defined, and equal to $+1$ or $-1$, for every
pair of non-null vectors of norm zero (mod $p$). There are $p^2 - 1$ such vectors
(mod $p$), and they form two mutually exclusive classes, which we will refer
to as $K$ and $L$, each containing $\frac{1}{2}(p^2 - 1)$ elements, and having the following
property: according as $a$ and $b$ belong to the same or different classes $K$ or $L$,
($3_1$) will be solvable or not, and $[a, b]_p$ will be $+1$ or $-1$. Properties of this
symbol and of a similar one which tells whether the solutions of (1) with
$g = 4$ or 8 satisfy $Nt \equiv 1$ or $-1 \pmod{4}$, are developed in § 16. The following
pretty property of Legendre symbols $(a \circ b \mid p)$ is proved therein:

THEOREM 3. The value of $[a, b]_p$, defined in Theorem 1, is multiplied by
$(-1 \mid p)$ if any coördinate $a_1$, $a_2$, or $a_3$ is changed in sign; and by $(-2 \mid p)$
if any two of $a_1, a_2, a_3$ are interchanged.


By symmetry these operations may be applied to $b$. For example, let

$a = 11i_1 + 4i_2 + 4i_3$, $b = 10i_1 + 7i_2 + 2i_3$, $p = 17$.

Then $(a \circ b \mid p) = (146 \mid 17) = -1$, and $\bar{t}at \equiv b \pmod{17}$ is unsolvable. The value
of $(a \circ b \mid p)$ remains $-1$, since $(-1 \mid p) = 1 = (-2 \mid p)$, when the coördinates
of $b$ are permuted or changed in sign; except that $17 \mid a \circ b$ when $\pm b = 2i_1 + 10i_2 - 7i_3$,
and then $\pm b \equiv 11a \pmod{17}$, $(11 \mid 17) = -1$. But if

$a = 7i_1 + 2i_2 + 2i_3$, $b = 5i_1 - 4i_2 - 4i_3$, and $p = 19$,

then $\bar{t}at \equiv b \pmod{19}$ is solvable with $t = 6$, since $b \equiv -2a \pmod{19}$ and $6^2 \equiv -2$;
it is also solvable with $b$ replaced by $-4i_1 + 5i_2 - 4i_3$, $-4i_1 - 4i_2 + 5i_3$, etc.,
when $[a, b]_p = 1$; and not solvable with $b$ replaced by $5i_1 + 4i_2 - 4i_3$,
$4i_1 + 5i_2 - 4i_3$, etc., when $[a, b]_p = -1$.
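
The two numerical examples can be checked by exhaustive search. The following is only a minimal Python sketch and not part of the original article: the helper names (`qmul`, `tbar_a_t`, `solutions`, `legendre`, `bracket`) are hypothetical, quaternions are stored as 4-tuples $(t_0, t_1, t_2, t_3)$, the vectors used are $a = (11, 4, 4)$, $b = (10, 7, 2)$ for $p = 17$ and $a = (7, 2, 2)$, $b = (5, -4, -4)$ for $p = 19$, and the search simply runs over all $p^4$ residues $t$.

```python
from itertools import product

def qmul(x, y):
    """Hamilton product of two quaternions stored as 4-tuples."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (x0*y0 - x1*y1 - x2*y2 - x3*y3,
            x0*y1 + x1*y0 + x2*y3 - x3*y2,
            x0*y2 - x1*y3 + x2*y0 + x3*y1,
            x0*y3 + x1*y2 - x2*y1 + x3*y0)

def tbar_a_t(a, t):
    """conj(t) * a * t for a pure quaternion a = (a1, a2, a3)."""
    tbar = (t[0], -t[1], -t[2], -t[3])
    return qmul(qmul(tbar, (0, *a)), t)

def solutions(a, b, p):
    """All t (mod p) with conj(t) a t = b (mod p), by exhaustive search."""
    target = (0, *(x % p for x in b))
    return [t for t in product(range(p), repeat=4)
            if tuple(c % p for c in tbar_a_t(a, t)) == target]

def legendre(x, p):
    """Legendre symbol (x|p) for an odd prime p, via Euler's criterion."""
    r = pow(x % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def bracket(a, b, p):
    """The symbol [a, b]_p; assumes p | Na, p | Nb, p does not divide a or b."""
    dot = sum(x * y for x, y in zip(a, b))
    if dot % p:
        return legendre(2 * dot, p)
    # a.b = 0 (mod p): b is a multiple m*a of a (mod p); find m by trial.
    m = next(m for m in range(1, p)
             if all((y - m * x) % p == 0 for x, y in zip(a, b)))
    return legendre(m, p)
```

For instance, `bracket((7, 2, 2), (5, -4, -4), 19)` evaluates the symbol for the second example, and `solutions((7, 2, 2), (5, -4, -4), 19)` lists the corresponding residues $t$, among them the scalar $t = 6$.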


Of three vectors $a, b, c$ of norm zero (mod $p$), either two or three are in
the same class $K$ or $L$; hence follows

THEOREM 4. Let $p$ be an odd prime, and let no two of

(9) $a = (a_1, a_2, a_3)$, $b = (b_1, b_2, b_3)$, $c = (c_1, c_2, c_3)$

be linearly dependent (mod $p$); assume $p \mid Na$, $Nb$, and $Nc$. Then exactly one
or three of the congruences

(10) $2x_1^2 \equiv b_1c_1 + b_2c_2 + b_3c_3$, $2x_2^2 \equiv c_1a_1 + c_2a_2 + c_3a_3$,
     $2x_3^2 \equiv a_1b_1 + a_2b_2 + a_3b_3 \pmod{p}$

are solvable in integers $x_1, x_2, x_3$.


The extension to $g = p^n$ ($n > 1$) is accomplished in § 11 for $p \nmid b$. The
modulus $g = 2^n$, which was treated extensively in AJ, is discussed further in
§§ 12 and 13. Composite $g$ and non-pure $a$ and $b$ are considered in §§ 14, 15.

A relation reminiscent of the connections between generic characters of
quadratic forms is obtained in § 17 for the symbols $[a, b]_p$ and the
corresponding symbol (mod 4 or 8). This enables us to give, in § 21, a much more
satisfactory extension of Theorem 1 of AJ concerning the equation (1’), for
any $h$ such that

(11) $h > 0$, $h \neq 4n$, $h \neq 8n + 7$.

Sections 18-22 study the classes A and B of that theorem, and the distribution
of values $[a, b]_p$.


2. THEOREM 5. Let $a$ and $b$ be pure integral quaternions; $p$ an odd
prime not dividing $h = Na$. The congruence ($3_1$) is solvable for an integral
quaternion $t$ if and only if

(12) $(Nb \mid p) = (Na \mid p)$ or 0.

If $p \mid Nb$, (12) is evidently satisfied; then ($3_1$) has

(13) the unique solution $t \equiv 0$, if $b \equiv 0$ and $(-Na \mid p) = -1$,
     $2p^2 - 1$ solutions $t$, if $b \equiv 0$ and $(-Na \mid p) = 1$,
     $\theta = p - (-h \mid p)$ solutions $t$, if $b \not\equiv 0$. In all these cases $Nt \equiv 0$.

If $p \nmid Nb$ and (12) holds, we can choose an integer $s$ satisfying ($5_1$); then

(14) ($3_1$) has precisely $2\theta$ solutions, $\theta$ of them satisfying each of $Nt \equiv \pm s$.

The necessity of (12) and of the respective restrictions $Nt \equiv 0$ or $\pm s$,
follows from the discussion of ($3_1$) to ($6_1$).
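
The solution counts stated in Theorem 5 can be observed on small primes. The sketch below is an assumption-laden illustration rather than anything from the article: `norm_class_counts` is a hypothetical helper that enumerates all $t$ (mod $p$), keeps those with $\bar{t}at \equiv b$, and tallies them by the residue of $Nt$; the test vector $a = i_1$ (so $h = 1$) is chosen only for convenience.

```python
from itertools import product

def qmul(x, y):
    """Hamilton product of two quaternions stored as 4-tuples."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (x0*y0 - x1*y1 - x2*y2 - x3*y3,
            x0*y1 + x1*y0 + x2*y3 - x3*y2,
            x0*y2 - x1*y3 + x2*y0 + x3*y1,
            x0*y3 + x1*y2 - x2*y1 + x3*y0)

def norm_class_counts(a, b, p):
    """Map {Nt mod p: number of t mod p with conj(t) a t = b (mod p)}."""
    target = (0, *(x % p for x in b))
    counts = {}
    for t in product(range(p), repeat=4):
        tbar = (t[0], -t[1], -t[2], -t[3])
        c = qmul(qmul(tbar, (0, *a)), t)
        if tuple(v % p for v in c) == target:
            n = sum(v * v for v in t) % p
            counts[n] = counts.get(n, 0) + 1
    return counts
```

With $p = 5$ and $a = i_1$ one has $(-h \mid p) = 1$, so the theorem predicts $2p^2 - 1 = 49$ solutions for $b \equiv 0$, and $p - (-h \mid p) = 4$ solutions per admissible norm class otherwise; calling `norm_class_counts((1, 0, 0), b, 5)` for a few vectors `b` shows this pattern.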


Case I, $p \nmid Nb$. Then $Nt$ is necessarily prime to $p$ and we need only verify
the following two statements:

($\alpha$) If, for a given $a$ of norm prime to $p$, $\kappa$ is the number of incongruent
pure quaternion residues $b$ (mod $p$) such that $(Nb \mid p) = (Na \mid p)$, and $\lambda$ is the
total number of residues $t$ of norm prime to $p$, then $\lambda = 2\theta\kappa$.

($\beta$) If ($3_1$) has a solution $t$, (14) holds.

Statement ($\alpha$) is an immediate consequence of formulae for the number
of solutions of quadratic congruences.² For given $h$ and $s$, both prime to $p$,
the number of solutions $(b_1, b_2, b_3)$ of

(15) $b_1^2 + b_2^2 + b_3^2 \equiv s^2 h \pmod{p}$

is exactly $\kappa' = p^2 + p(-h \mid p)$. The condition $(Nb \mid p) = (h \mid p)$ for
determining $b$ is equivalent to (15) with $s^2$ ranging over $\frac{1}{2}(p - 1)$ quadratic
residues. Hence $\kappa = \frac{1}{2}(p - 1)\kappa'$. To evaluate $\lambda$ we remark that, of the
$p^4$ residues $t$ (mod $p$), exactly $p^3 + p^2 - p$ satisfy $t_0^2 + t_1^2 + t_2^2 + t_3^2 \equiv 0$.
Thus $\lambda = p^4 - p^3 - p^2 + p = p(p - 1)^2(p + 1)$. The quotient $\lambda/\kappa$ is $2\theta$.
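
The counting formulae quoted from Jordan can be spot-checked numerically. The following Python sketch is an illustration under stated assumptions, not the authors' method: the function names are hypothetical, and the three formulae tested are the counts $p^2 + p(-h \mid p)$ for three squares with a value $s^2h$ prime to $p$, $p^3 + p^2 - p$ for four squares summing to zero, and $p - (-h \mid p)$ for the conic $t_0^2 + \lambda^2 h \equiv 1$.

```python
from itertools import product

def legendre(x, p):
    """Legendre symbol (x|p) for an odd prime p, via Euler's criterion."""
    r = pow(x % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def count_sum_of_squares(k, c, p):
    """Number of (x_1, ..., x_k) mod p with x_1^2 + ... + x_k^2 = c (mod p)."""
    return sum(1 for x in product(range(p), repeat=k)
               if (sum(v * v for v in x) - c) % p == 0)

def count_conic(h, p):
    """Number of pairs (t0, lam) mod p with t0^2 + lam^2 * h = 1 (mod p)."""
    return sum(1 for t0, lam in product(range(p), repeat=2)
               if (t0 * t0 + lam * lam * h - 1) % p == 0)
```

Comparing `count_sum_of_squares(3, s*s*h, p)` with `p*p + p*legendre(-h, p)` for a few small odd primes gives a quick consistency check of the value used for the number of solutions of (15).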

As regards statement ($\beta$) we can define a one-to-one correspondence
between the solutions of ($3_1$) and those of

(16) $\bar{t}at \equiv a \pmod{p}$.


2 We employ specializations, where needed, of such formulae in Bachmann’s Zahlen- 
theorie IV, 1898, pp. 491-492. They are due to C. Jordan. 




For if $w$ is a particular solution of ($3_1$) and $u$ is any other solution, $u\bar{w}/Nw$
(mod $p$) is a solution of (16), having its norm congruent to $+1$ or $-1$ (mod $p$)
according as $Nu$ is congruent to $Nw$ or $-Nw$. Hence statement ($\beta$) follows
from

LEMMA 1. If $a$ is pure and $p \nmid h$, the congruence (16) has precisely
$p - (-h \mid p)$ solutions satisfying each of $Nt \equiv \pm 1 \pmod{p}$.

Proof. If $t$ satisfies (16) and $Nt \equiv 1$, then $at \equiv ta$ and $Nt \equiv 1$; and
conversely. Since $p \nmid a$ the condition $at \equiv ta$, which expands into

(17) $a_2t_3 \equiv a_3t_2$, $a_3t_1 \equiv a_1t_3$, $a_1t_2 \equiv a_2t_1 \pmod{p}$,

is equivalent to the existence of an integer $\lambda$ satisfying

(18) $t_1 \equiv \lambda a_1$, $t_2 \equiv \lambda a_2$, $t_3 \equiv \lambda a_3 \pmod{p}$.

For any $\lambda$, $t_1, t_2, t_3$ are uniquely determined and $t_0$ is determined by

(19) $t_0^2 + \lambda^2 h \equiv 1 \pmod{p}$,

to which the norm condition $Nt \equiv 1$ reduces. That (19) has $\theta$ solutions $(t_0, \lambda)$
follows from Jordan’s formulae (cf. preceding footnote).

The conditions (16) and $Nt \equiv -1$ reduce to the latter with $at \equiv -ta$,
that is, $a_1t_1 + a_2t_2 + a_3t_3 \equiv 0$, $a_1t_0 \equiv a_2t_0 \equiv a_3t_0 \equiv 0 \pmod{p}$. Since $p \nmid a$
the last three are equivalent to $t_0 \equiv 0$. It remains to show that there are
exactly $\theta$ sets $(t_1, t_2, t_3)$ (mod $p$) satisfying

(20) $a_1t_1 + a_2t_2 + a_3t_3 \equiv 0$, $t_1^2 + t_2^2 + t_3^2 \equiv -1 \pmod{p}$.

We can suppose $p \nmid a_3$, solve ($20_1$) for $t_3$, $t_3 \equiv -e_1t_1 - e_2t_2$ where $e_1 \equiv a_1/a_3$,
$e_2 \equiv a_2/a_3$ (mod $p$); and substituting in ($20_2$) need to see that there are
$p - (-h \mid p)$ solutions $(t_1, t_2)$ of

(21) $(1 + e_1^2)t_1^2 + 2e_1e_2t_1t_2 + \{(1 + e_2^2)t_2^2 + 1\} \equiv 0 \pmod{p}$.

If $p \mid 1 + e_1^2$, (21) has no solutions with $t_2 \equiv 0$, and for each of the $p - 1$
residues $t_2$ prime to $p$, has an unique solution $t_1$. Further, $(-h \mid p) = 1$,
since $p \mid a_3^2 + a_1^2$, $h \equiv a_2^2$, and $(-1 \mid p) = 1$.

If $p \nmid 1 + e_1^2$, (21) has for each $t_2$, $1 + (\Delta \mid p)$ solutions $t_1$, where $\Delta$ is the
discriminant of (21) considered as a quadratic in $t_1$. Now

$\frac{1}{4}a_3^2\Delta \equiv -ht_2^2 + g$, where $g = a_2^2 - h \not\equiv 0 \pmod{p}$.

Hence $\Delta$ is a quadratic residue for $\frac{1}{2}(p - 1 - (-h \mid p) - (gh \mid p))$ residues
$t_2$, and vanishes (mod $p$) for $1 + (gh \mid p)$ residues $t_2$; leading to $\theta$ values as
required.

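3. When $p \mid Nb$ we cannot reduce the problem by the artifice used above.
Writing

(22) $at = -x_0 + i_1x_1 + i_2x_2 + i_3x_3 = \xi$, $\bar{t}at = i_1c_1 + i_2c_2 + i_3c_3 = c$

as in AJ, § 11, we have

(23) $x_0 = a_1t_1 + a_2t_2 + a_3t_3$, $x_1 = a_1t_0 + a_2t_3 - a_3t_2$,
     $x_2 = a_2t_0 + a_3t_1 - a_1t_3$, $x_3 = a_3t_0 + a_1t_2 - a_2t_1$,

(24) $c_1 = a_1(2(t_0^2 + t_1^2) - Nt) + 2a_2(t_0t_3 + t_1t_2) + 2a_3(-t_0t_2 + t_1t_3)$, $\cdots$,

the dots indicating that $c_2$ and $c_3$ are obtained from $c_1$ by cyclic permutation
of subscripts 1, 2, 3, as $x_2$ and $x_3$ from $x_1$. Among the large number of
identities true for any $t$ and pure $a$ the following will be useful:

($25_0$) $x_0^2 + ht_0^2 = \frac{1}{2}(a_1c_1 + a_2c_2 + a_3c_3 + hNt)$,
($25_1$) $x_1^2 + ht_1^2 = \frac{1}{2}(a_1c_1 - a_2c_2 - a_3c_3 + hNt)$, $\cdots$,
(26) $x_0x_1 - ht_0t_1 = \frac{1}{2}(a_2c_3 - a_3c_2)$, $\cdots$,
(27) $x_2x_3 + ht_2t_3 = \frac{1}{2}(a_2c_3 + a_3c_2)$, $\cdots$,
(28) $x_0t_1 + x_1t_0 = \frac{1}{2}(c_1 + a_1Nt)$, $\cdots$,
(29) $x_2t_3 - x_3t_2 = \frac{1}{2}(c_1 - a_1Nt)$, $\cdots$;

the dots indicating like formulae obtained cyclically; and $h = Na$.

Case II, $p \mid b$. By ($3_1$), $c \equiv b$ (mod $p$), and $Nt \equiv 0$; hence by (25),
$ht_f^2 \equiv -x_f^2$ ($f = 0, 1, 2, 3$). If $(-h \mid p) = -1$, the only solution is $t \equiv 0$.
Let $(-h \mid p) = 1$. Since the right-members of (28)-(29) vanish, $x_0, x_1, x_2, x_3$
are proportional to $-t_0, t_1, t_2, t_3$ (mod $p$) respectively; that is

(30) $x_0 \equiv -\lambda t_0$, $x_1 \equiv \lambda t_1$, $x_2 \equiv \lambda t_2$, $x_3 \equiv \lambda t_3 \pmod{p}$

for some integer $\lambda$. By (25), $\lambda$ must be one of the two solutions of

(31) $\lambda^2 + h \equiv 0 \pmod{p}$.

By (23) conditions (30) expand into

(30’) $\lambda t_0 + a_1t_1 + a_2t_2 + a_3t_3 \equiv 0$, $a_1t_0 - \lambda t_1 - a_3t_2 + a_2t_3 \equiv 0$,
      $a_2t_0 + a_3t_1 - \lambda t_2 - a_1t_3 \equiv 0$, $a_3t_0 - a_2t_1 + a_1t_2 - \lambda t_3 \equiv 0 \pmod{p}$.

All third order minor determinants in (30’) vanish in virtue of (31). Hence
only two of the four congruences are independent. If $p$ could divide every
$\lambda^2 + a_f^2$ it would divide $h$. We can suppose $p \nmid \lambda^2 + a_1^2$; then the first two
congruences (30’) are independent, $t_2$ and $t_3$ may be selected arbitrarily and
then $t_0$ and $t_1$ are uniquely determined. Thus for each $\lambda$ there are exactly $p^2$
solutions $t$, the only common solution for $\lambda$ and $-\lambda$ being $t \equiv 0$. Now ($13_2$)
will follow if the solutions $t$ of (30’) with $\lambda$ determined by (31) satisfy ($3_1$).
To see this note that (30) holds for the $x_f$ so obtained, $N(at) = \sum x_f^2$,
$hNt \equiv -hNt$ by (31), $Nt \equiv 0$, $\bar{t}at = \bar{t}\xi \equiv \bar{t}\lambda t = \lambda Nt \equiv 0 \pmod{p}$.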


4. Case III, $p \mid Nb$, $p \nmid b$. We can suppose $p \nmid b_1$. Since $p \mid Nt$ by ($3_1$),
and the left members of (28) or (29) differ from the coördinates of $\frac{1}{2}\bar{t}at$ by
a multiple of $Nt$, the solutions of ($3_1$) coincide with those of either congruence
triple

(32) $x_0t_1 + x_1t_0 \equiv \frac{1}{2}b_1$, $x_0t_2 + x_2t_0 \equiv \frac{1}{2}b_2$, $x_0t_3 + x_3t_0 \equiv \frac{1}{2}b_3$;
     $x_2t_3 - x_3t_2 \equiv \frac{1}{2}b_1$, $x_3t_1 - x_1t_3 \equiv \frac{1}{2}b_2$, $x_1t_2 - x_2t_1 \equiv \frac{1}{2}b_3 \pmod{p}$

with $Nt \equiv 0$. Multiplying the second triple by $t_1, t_2, t_3$ and adding we get

(33) $b_1t_1 + b_2t_2 + b_3t_3 \equiv 0 \pmod{p}$.

Eliminating $x_0$ between ($32_2$) and ($32_3$), and using ($32_4$), we get

(34) $b_1t_0 + b_3t_2 - b_2t_3 \equiv 0 \pmod{p}$.

If $t_0$ and $t_1$ are determined from (33)-(34) then $Nt \equiv 0$. Replacing the $x_f$
in either triple (32) by their values from (23) and substituting for $t_0$ and $t_1$
from (33)-(34), we find the three congruences obtained from

(35) $(a_3b_3 - a_1b_1 - a_2b_2)t_2^2 - 2(a_2b_3 + a_3b_2)t_2t_3 + (a_2b_2 - a_3b_3 - a_1b_1)t_3^2 \equiv \frac{1}{2}b_1^2$

when both sides are multiplied by $b_1, b_2, b_3$. It remains only to show that
(35) has exactly $\theta$ solutions $(t_2, t_3)$.

If $p \nmid a_3b_3 - a_1b_1 - a_2b_2$ the discriminant of (35) as a quadratic in $t_2$ is
$\equiv 4b_1^2(-ht_3^2 + g)$ (mod $p$), where $g = \frac{1}{2}(a_3b_3 - a_1b_1 - a_2b_2)$.
The number of solutions is $\theta$ by the same argument as at the end of § 2.
If $p$ divides both extreme coefficients in (35) then $p \nmid a_2b_3 + a_3b_2$, so that
there are $p - 1$ solutions. For then $p \mid a_1b_1$, $p \mid a_2b_2 - a_3b_3$, $p \mid a_1$, and since
$p \nmid Na$, $p \nmid b_1$, and $p \mid Nb$, $p \nmid (a_2^2 + a_3^2)(b_2^2 + b_3^2)$,

$p \nmid (a_2b_3 + a_3b_2)^2 = (a_2^2 + a_3^2)(b_2^2 + b_3^2) - (a_2b_2 - a_3b_3)^2$.

But from $p \mid a_3b_3 - a_1b_1 - a_2b_2$ and $p \mid Nb$ follows $(-h \mid p) = 1$. For we have

LEMMA 2. Let $h = a_1^2 + a_2^2 + a_3^2$, $(-h \mid p) = -1$. Then the congruences

(36) $a_1k_1 + a_2k_2 + a_3k_3 \equiv 0$, $k_1^2 + k_2^2 + k_3^2 \equiv 0 \pmod{p}$

have no solution $k_1, k_2, k_3$ other than $k_1 \equiv k_2 \equiv k_3 \equiv 0$.

The proof reduces to a discussion of (21) with the last $+1$ replaced by
0; the discriminant of the quadratic is $-4hk_2^2/a_3^2$, whence the lemma
follows and the proof of Theorem 5 is complete.


5. The norm-class (mod $n$) of $t$. By this we mean the residue class
(mod $n$) to which $Nt$ belongs. Theorem 5 can be paraphrased succinctly thus:

COROLLARY 1. For pure $a$, $b$, $Na$ being prime to $p$, ($3_1$) has precisely $\eta$
solutions in each norm-class (mod $p$) consistent with ($4_1$); $\eta$ is 1, $2p^2 - 1$,
or $p - (-Na \mid p)$ according as

$p \mid b$ and $(-Na \mid p) = -1$, $p \mid b$ and $(-Na \mid p) = 1$, or $p \nmid b$.

The remainder of Theorem 5 may be deduced by solving ($5_1$) for $s$.


6. THEOREM 6. Let $a$ and $b$ be pure, integral quaternions, $p$ an odd
prime, $p \mid Na$, $p \nmid a$. Assume that ($3_1$) has at least one integral solution
$t \equiv u$. Then if $p \mid b$, ($3_1$) has exactly $p^2$ solutions $t$, all satisfying $Nt \equiv 0$.
If $p \nmid b$, ($3_1$) has precisely $2p^2$ solutions $t$, of which $2p$ satisfy each of

(37) $Nt \equiv r \pmod{p}$, $r = 0, 1, \cdots, p - 1$.

Except for the distribution of norm residues given by (37) this was
proved in AJ (Theorem 11), by demonstrating the equivalence, if $p \mid Na$, of

(38) $\bar{t}at \equiv \bar{u}au \pmod{p}$ and $at \equiv \pm au \pmod{p}$.

This equivalence can be deduced more simply as follows. Set

$au = -y_0 + i_1y_1 + i_2y_2 + i_3y_3$, $\bar{u}au = c'$

as in (22). Then ($38_1$) becomes $c \equiv c'$ and implies by (25)-(27) that
$x_fx_g \equiv y_fy_g$ ($f, g = 0, 1, 2, 3$), whence $x_f \equiv \pm y_f$ ($f = 0, 1, 2, 3$) with all the
signs alike. Conversely the latter conditions, which are the same as ($38_2$),
imply by (25)-(27) that $a_fc_g \equiv a_fc'_g$ ($f, g = 1, 2, 3$), $c \equiv c'$.

As in the discussion of (66) in AJ, $at \equiv \epsilon au$ (where $\epsilon = \pm 1$) when
expanded reduces to two independent congruences, for example if $p \nmid a_1$, to

(39) $a_1t_1 + a_2t_2 + a_3t_3 \equiv k_1 = \epsilon(a_1u_1 + a_2u_2 + a_3u_3)$,
     $a_1t_0 - a_3t_2 + a_2t_3 \equiv k_2 = \epsilon(a_1u_0 - a_3u_2 + a_2u_3)$,

the other congruences $x_2 \equiv \epsilon y_2$, $x_3 \equiv \epsilon y_3$, being linear combinations of these.
Clearly $k_1$ and $k_2$ will be both zero (mod $p$) if and only if $au \equiv 0$, and hence
by a special case of (38) if and only if $\bar{u}au \equiv 0$. By (39),

(40) $t_0^2 + t_1^2 + t_2^2 + t_3^2 \equiv (k_1^2 + k_2^2)/a_1^2 - 2(k_1a_2 - k_2a_3)t_2/a_1^2 - 2(k_1a_3 + k_2a_2)t_3/a_1^2$.

In case $p \mid b$, $k_1 \equiv k_2 \equiv 0$, $t_2$ and $t_3$ may be chosen arbitrarily and $t_0$, $t_1$
are then fixed by (39); hence there are $p^2$ solutions $t$ and all have $Nt \equiv 0$
in view of (40). If $p \nmid b$, consider (39) with a fixed sign, and (40). Let $r$
be a given integer. Then $Nt \equiv r$ (mod $p$) has precisely $p$ solutions $t_2, t_3$ and
thus (39) has precisely $p$ solutions $t$ with $Nt \equiv r$, unless simultaneously

$k_1a_2 - k_2a_3 \equiv 0$, $k_1a_3 + k_2a_2 \equiv 0 \pmod{p}$.

But this would imply, since $k_1$ and $k_2$ are not both zero, that $a_2^2 + a_3^2 \equiv 0$,
whence $p \mid a_1$. Since $k_1$ or $k_2$ is altered with the sign of $\epsilon$, the $p$ solutions
obtained with $\epsilon = 1$ are distinct from those with $\epsilon = -1$.

As shown on p. 911 of AJ it follows that exactly half the $p^2 - 1$ residues
$b$ (mod $p$) such that $Nb \equiv 0$ but $b \not\equiv 0$ are represented by $\bar{t}at$, for a given $a$
such that $p \mid Na$ but $p \nmid a$, each being represented for $2p^2$ residues $t$. The
following corollary will not be used in this article but is useful in finding
rapidly the set of residues $\bar{t}at$ (mod $p$).

COROLLARY 2. If $p \mid Na$ and $p \nmid a_1$, the $\frac{1}{2}(p^2 - 1)$ non-zero residues
(mod $p$) represented by $\bar{t}at$ are obtained each twice when $t$ is given the
values $t_0 + i_1t_1$ ($t_0 = 0, 1, \cdots, p - 1$; $t_1 = 0, 1, \cdots, p - 1$), omitting
$t_0 = t_1 = 0$.

For by the discussion of (39) each non-zero residue $b$ is represented by
$\bar{t}at$ for two values $t$ with $t_2 \equiv t_3 \equiv 0$. Evidently if $p \nmid a_2$ or $a_3$, the same
result holds with $t_0 + i_2t_2$, $t_0 + i_3t_3$ in place of $t_0 + i_1t_1$.


7. LEMMA 3. Let $p$ be an odd prime, $r$ an integer, $a_1, a_2, a_3$ integers
not all divisible by $p$, $p \mid a_1^2 + a_2^2 + a_3^2$. Then the congruences

(41) $x_1^2 + x_2^2 + x_3^2 \equiv 0$, $a_1x_1 + a_2x_2 + a_3x_3 \equiv r \pmod{p}$

have precisely $p$ incongruent, integral sets of solutions $(x_1, x_2, x_3)$.

We can adjust subscripts so that $p \nmid a_2a_3$. Then ($41_2$) can be solved for
$x_3$, and the result substituted in ($41_1$), yielding

(42) $(1 + e_1^2)x_1^2 - 2e_1(s - e_2x_2)x_1 + (1 + e_2^2)x_2^2 - 2e_2sx_2 + s^2 \equiv 0 \pmod{p}$,

where $s \equiv r/a_3$, $e_1 \equiv a_1/a_3$, $e_2 \equiv a_2/a_3$, $x_3 \equiv s - e_1x_1 - e_2x_2$. Since $p \nmid 1 + e_1^2$,
(42) has for each value $x_2$, $1 + (\Delta \mid p)$ solutions $x_1$, where $\Delta$ is the
discriminant of (42) taken as a quadratic in $x_1$. Using $p \mid 1 + e_1^2 + e_2^2$ we find

$\Delta \equiv 4(2x_2e_2s - s^2) \pmod{p}$, where $p \nmid e_2$.

If $p \mid s$, $\Delta$ vanishes (mod $p$) for each of the $p$ incongruent choices of $x_2$; if
$p \nmid s$, $\Delta$ is a quadratic residue for $\frac{1}{2}(p - 1)$ choices $x_2$, and vanishes for one;
in either case there are $p$ solutions.
8. Proof of Theorem 2. By lemma 3 with $r = 0$, when $a$ is given there
are exactly $p$ residues $b$ such that $p \mid Nb$ and $p \mid a \circ b$. Since $a, 2a, \cdots, pa$ (mod $p$)
exhaust these possibilities, no two of them being congruent, Theorem 2
follows.
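
Lemma 3 and the deduction of Theorem 2 from it lend themselves to a direct numerical check. The sketch below is illustrative only; `isotropic_with_dot` is a hypothetical helper that lists all $x$ (mod $p$) with $x_1^2 + x_2^2 + x_3^2 \equiv 0$ and $a_1x_1 + a_2x_2 + a_3x_3 \equiv r$, and the sample vectors $a = (1, 2, 3)$ with $p = 7$ and $a = (1, 2, 0)$ with $p = 5$ are assumptions chosen so that $p \mid Na$.

```python
from itertools import product

def isotropic_with_dot(a, r, p):
    """All x mod p with x1^2 + x2^2 + x3^2 = 0 and a.x = r (mod p)."""
    return [x for x in product(range(p), repeat=3)
            if sum(c * c for c in x) % p == 0
            and (sum(u * v for u, v in zip(a, x)) - r) % p == 0]

def multiples_of(a, p):
    """The p residues a, 2a, ..., pa (mod p)."""
    return {tuple(m * c % p for c in a) for m in range(p)}
```

For $r = 0$ the list returned by `isotropic_with_dot(a, 0, p)` should coincide with `multiples_of(a, p)`, which is the content of Theorem 2 for the chosen $a$.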


9. Proof of Theorem 1. For a given $a$, precisely $\frac{1}{2}(p^2 - 1)$ residues $b$,
non-zero mod $p$, satisfy $[a, b]_p = 1$. For, by lemma 3, there are $p$ solutions $b$ of

$Nb \equiv 0$, $a_1b_1 + a_2b_2 + a_3b_3 \equiv r \pmod{p}$

for each of the $\frac{1}{2}(p - 1)$ values $r$ such that $(2r \mid p) = 1$, and besides we have
$b \equiv ma$ with $m$ one of $\frac{1}{2}(p - 1)$ quadratic residues of $p$; in all

$\frac{1}{2}p(p - 1) + \frac{1}{2}(p - 1) = \frac{1}{2}(p^2 - 1)$ residues $b$.

That these $\frac{1}{2}(p^2 - 1)$ residues $b$ coincide with the $\frac{1}{2}(p^2 - 1)$ non-zero
residues represented by $\bar{t}at$ will be evident when we show that $[a, b]_p = 1$
is a necessary condition for the solvability of ($3_1$). This follows from (25),
where now $h \equiv 0$ and $c \equiv b$: clearly $2a \circ b$ must be a quadratic residue of $p$
unless $a \circ b \equiv 0$; and then by Theorem 2, $b \equiv ma$ for some integer $m$; hence
replacing the $c_f$ in (25) by $ma_f$, we have $ma_f^2 \equiv x_f^2$ for $f = 1, 2, 3$, that is
$(m \mid p) = 1$.

COROLLARY 3. If $p$ does not divide $a$, $b$, but divides their norms, then
($3_1$) is solvable only if $[a, b]_p = 1$, and then has precisely $2p$ solutions in each
norm-class consistent with the norm condition ($4_1$). (Cf. § 5.)


10. THEOREM 6. Let $a$ and $b$ be pure integral quaternions, $p$ an odd
prime, $p \nmid b$. Let $\eta$ denote the number of solutions $t$ of ($3_1$) in each
norm-class (mod $p$) consistent with ($4_1$). Then ($3_n$) has precisely $\eta p^{n-1}$ solutions
$t$ (mod $p^n$) in every norm-class (mod $p^n$) consistent with ($4_n$).

Remarks. The values of $\eta$ are given in corollaries 1 and 3. By solving
the quadratic congruence ($5_n$), by familiar methods, we can derive from
Theorem 6 complete information, when $p \nmid b$, regarding the number of
solutions of ($3_n$) and their distribution in norm-classes. For example:

Let $(Na \mid p) = (Nb \mid p) \neq 0$. Then ($3_n$) is solvable for every integer
$n \geq 1$, and has precisely $2\theta p^{n-1}$ solutions $t$ (mod $p^n$), half of them satisfying
each of $Nt \equiv s$ or $-s$ (mod $p^n$), $s$ satisfying ($5_n$). Here by Theorem 5,
$\theta = p - (-Na \mid p)$.
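
The passage from modulus $p$ to modulus $p^n$ can be watched on the smallest case $p = 3$, $n = 2$. This is a hedged sketch, not the authors' procedure: `norm_class_counts` is a hypothetical brute-force helper working for any modulus `m`, and the data $a = b = i_1$ are chosen only because $(Na \mid 3) = (Nb \mid 3) = 1$ and $p - (-Na \mid p) = 4$.

```python
from itertools import product

def qmul(x, y):
    """Hamilton product of two quaternions stored as 4-tuples."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (x0*y0 - x1*y1 - x2*y2 - x3*y3,
            x0*y1 + x1*y0 + x2*y3 - x3*y2,
            x0*y2 - x1*y3 + x2*y0 + x3*y1,
            x0*y3 + x1*y2 - x2*y1 + x3*y0)

def norm_class_counts(a, b, m):
    """Map {Nt mod m: number of t mod m with conj(t) a t = b (mod m)}."""
    target = (0, *(x % m for x in b))
    counts = {}
    for t in product(range(m), repeat=4):
        tbar = (t[0], -t[1], -t[2], -t[3])
        c = qmul(qmul(tbar, (0, *a)), t)
        if tuple(v % m for v in c) == target:
            n = sum(v * v for v in t) % m
            counts[n] = counts.get(n, 0) + 1
    return counts
```

Comparing `norm_class_counts((1, 0, 0), (1, 0, 0), 3)` with `norm_class_counts((1, 0, 0), (1, 0, 0), 9)` shows each norm class gaining a factor $p = 3$ in its number of solutions, in line with the stated count $2\theta p^{n-1}$.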

Let $p$ divide $Nb$ but not $a$ nor $b$. Let $p^r$ be the least power of $p$ not
dividing $Nb$. Assume that ($3_1$) is solvable. Then if ($5_n$) has a solution $s$
when $n = r$, ($3_n$) is solvable for every $n$. But if ($5_n$) is not solvable when
$n = r$, ($3_n$) is solvable when $n < r$, and not when $n \geq r$. This follows
from the fact that for $n < r$, ($5_n$) is trivially solvable; but is solvable with
$n \geq r$, if and only if it is solvable when $n = r$.


11. Proof of Theorem 6. Let $n > 1$ and denote an arbitrary solution
of ($3_{n-1}$) by $t$, and of ($3_n$) by $w$. Since every $w$ is a $t$, we obtain each $w$
(mod $p^n$) exactly once in $w = t + p^{n-1}u$, as $t$ ranges over all solutions of
($3_{n-1}$) (mod $p^{n-1}$), and for each $t$, $u$ ranges over all solutions of

(43) $(\bar{t} + p^{n-1}\bar{u})\,a\,(t + p^{n-1}u) \equiv b \pmod{p^n}$.

For $u$ is hereby determined to modulus $p$, since (43) reduces to

(44) $\bar{t}au + \bar{u}at \equiv -v \pmod{p}$,

where $\bar{t}at - b = p^{n-1}v$. Set $v = i_1v_1 + i_2v_2 + i_3v_3$ and use (22); (44) becomes

(45) $x_1u_0 + x_0u_1 - x_3u_2 + x_2u_3 \equiv -\frac{1}{2}v_1$,
     $x_2u_0 + x_3u_1 + x_0u_2 - x_1u_3 \equiv -\frac{1}{2}v_2$,
     $x_3u_0 - x_2u_1 + x_1u_2 + x_0u_3 \equiv -\frac{1}{2}v_3 \pmod{p}$.

The matrix of coefficients of $u_0, u_1, u_2, u_3$ has determinants equal to

(46) $\pm x_0N\xi$, $\pm x_1N\xi$, $\pm x_2N\xi$, $\pm x_3N\xi$ respectively.

The case $p \nmid NaNb$ is easy. Then $p \nmid Nt$, whence $p \nmid N\xi = NaNt$, and
the numbers (46) are not all zero (mod $p$). Hence one of the $u_f$ may be
chosen arbitrary (mod $p$), the others then uniquely determined. Hence ($3_n$)
has $p$ times as many solutions as ($3_{n-1}$), there being $p$ values $u$ (mod $p$) for
each $t$. Lemma 4 follows for $p \nmid Nb$.

Let $p \mid Nb$, $p \nmid b$. Then $p \mid N\xi$, and the matrix of coefficients in (45) is of
rank $< 3$. As shown in AJ in discussing (58), the linear combination of
(45) with multipliers $c_1, c_2, c_3$ vanishes. The condition

(47) $c_1v_1 + c_2v_2 + c_3v_3 \equiv 0 \pmod{p}$

is thus necessary to the solvability of (45). Since $c - b = p^{n-1}v$,

$Nb = Nc - p^{n-1}(c\bar{v} + v\bar{c}) + p^{2n-2}Nv$, $c\bar{v} + v\bar{c} = 2(c_1v_1 + c_2v_2 + c_3v_3)$.

Hence (47) is equivalent to $p^n \mid Nb - Nc$, which is ($4_n$). A solution $t$ of
($3_{n-1}$) will therefore not provide solutions $w$ of ($3_n$) unless ($4_n$) holds; if ($4_n$)
holds we call $t$ eligible. If $t$ is eligible and $p \nmid c_f$ for $f = 1, 2$, or 3, then the
$f$-th congruence (45) is a linear combination of the other two.

To compare the norms of $w$ and $t$ we have

$Nw = N(t + p^{n-1}u) = Nt + p^{n-1}(t\bar{u} + u\bar{t}) + p^{2n-2}Nu$,
$Nw \equiv Nt + 2p^{n-1}(t_0u_0 + t_1u_1 + t_2u_2 + t_3u_3) \pmod{p^n}$.

Thus $Nw \equiv Nt$ (mod $p^{n-1}$), and $Nw$ can be made to have any residue (mod $p^n$)
consistent with this by choice of $\rho$ in

(48) $t_0u_0 + t_1u_1 + t_2u_2 + t_3u_3 \equiv \rho \pmod{p}$.


Consider now the system of three congruences (48) and ($45_2$) and ($45_3$),
under the hypotheses that $p \mid Nb$, $t$ is eligible, and $p \nmid c_1$ (which we can suppose
if $p \nmid b$), whence (45) does reduce to its last two congruences. The matrix
of coefficients of $u_0, u_1, u_2, u_3$ has determinants equal respectively to

$c_1x_0$, $c_1x_1$, $-c_1x_2$, $c_1x_3$.

This can be seen conveniently as follows. Denote by $\Delta_i$ ($i = 0, 1, 2, 3$) the
determinant of order 3 formed by dropping the coefficients of $u_i$; since

$\bar{t}\xi = (t_0 - i_1t_1 - i_2t_2 - i_3t_3)(-x_0 + i_1x_1 + i_2x_2 + i_3x_3)$

is a pure quaternion, we have

$0 = -t_0x_0 + t_1x_1 + t_2x_2 + t_3x_3 = c_0$, $c_1 = t_0x_1 + t_1x_0 - t_2x_3 + t_3x_2$;

but, identically in components of $t$ and $\xi$,

$\Delta_0 - c_1x_0 = c_0x_1$, $\Delta_1 - c_1x_1 = -c_0x_0$,
$\Delta_2 + c_1x_2 = -c_0x_3$, $\Delta_3 - c_1x_3 = -c_0x_2$.

Hence the statement concerning the matrix is accurate. Since $p$ cannot divide
$\xi$ without dividing $c$, the matrix is of rank 3: for any assigned integer $\rho$ there
are precisely $p$ solutions of the system (45) and (48).

Noting, finally, that ($3_1$) is unsolvable if $p \mid Na$ but $p \nmid Nb$, we have

LEMMA 4. If $t$ is a solution of ($3_{n-1}$) and $n > 1$, then the congruence

(49) $\bar{w}aw \equiv b \pmod{p^n}$

has precisely $p$ solutions of the form $w = t + p^{n-1}u$, if $p \nmid Nb$; if $p \mid Nb$,
(49) has solutions of this form if and only if $t$ is an eligible solution of ($3_{n-1}$),
that is $t$ satisfies also ($4_n$), and then (49) has precisely $p^2$ solutions $w$ of this
form, $p$ of them satisfying each of

(50) $Nw \equiv Nt + 2\rho p^{n-1} \pmod{p^n}$, $\rho = 0, 1, \cdots, p - 1$.

In case $p \nmid Nb$, represent the two solutions of ($5_n$) by $s$ and $-s$. These
may be taken to represent the two solutions of ($5_{n-1}$). If $t$ belongs to the
norm-class $s$ (mod $p^{n-1}$), the $p$ derived solutions $w = t + p^{n-1}u$ evidently
belong only to the norm-class $s$ (mod $p^n$). For $Nw \equiv Nt \equiv s$ (mod $p$), while
$s \not\equiv -s$ (mod $p$). This proves the case $p \nmid Nb$ of

LEMMA 5. If ($3_{n-1}$) has precisely $k$ solutions (mod $p^{n-1}$) in each
norm-class satisfying ($4_{n-1}$), then ($3_n$) has precisely $kp$ solutions (mod $p^n$) in each
norm-class satisfying ($4_n$).


Let $p \mid Nb$. Then $p \mid NaNt$. Hence if $t$ and $t'$ are solutions of ($3_{n-1}$)
such that $Nt \equiv Nt'$ (mod $p^{n-1}$), $Na(Nt')^2 \equiv Na(Nt)^2$ (mod $p^n$). That is all
or none of the solutions $t$ in a norm-class (mod $p^{n-1}$) are eligible. We may
speak of an eligible norm-class. Every $t$ in an eligible norm-class (mod $p^{n-1}$)
leads to the same set of $p$ norm-classes (mod $p^n$), given by (50); and evidently
no $t'$ in a different norm-class can lead to any of these. This proves that
every norm-class (mod $p^n$) which is derived from an eligible norm-class
(mod $p^{n-1}$) contains precisely $kp$ solutions $w$. The lemma will follow when
we show that every norm-class (mod $p^n$) satisfying ($4_n$) is derivable from an
eligible norm-class (mod $p^{n-1}$). Let $s$ satisfy ($5_n$) and hence ($5_{n-1}$). By the
hypothesis of the lemma there is a $t$ of norm $s$ (mod $p^{n-1}$) (in fact $k$ of them).
Since $p \mid NaNt$, $(Nt)^2Na \equiv s^2Na \equiv Nb$ (mod $p^n$); so that $t$ is an eligible
solution of ($3_{n-1}$). Clearly $s$ will be among the norm-classes (mod $p^n$), as in
(50), to which $t$ leads.


12. Modulus $2^n$. Let $a$ and $b$ denote pure quaternions, neither divisible
by 2. Set $l = 1$ or 2 according as $Na$ is odd or $\equiv 2$ (mod 4). It is shown
in AJ that if $n \geq 2$, the congruence

(51) $\bar{t}at \equiv b \pmod{l2^n}$

is solvable for $t$ if and only if the congruence

(52) $\bar{t}at \equiv b \pmod{4l}$

is solvable; that (52) is solvable if and only if

(53) $a \equiv b \pmod{2}$, $Na \equiv Nb \pmod{8l}$.

If $n = 1$ and $l = 2$, (51) is solvable if and only if

(54) $a \equiv b \pmod{2}$, $Na \equiv Nb \pmod{8}$;

and if $l2^n = 2$, if and only if $a \equiv b \pmod{2}$.
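
The solvability condition for the modulus 4 can be illustrated on one small case, $a = i_1$ with $Na$ odd (so $l = 1$). The sketch below rests on assumptions of its own: `odd_image` and `satisfies_53` are hypothetical helpers, the first collecting every residue $\bar{t}at$ (mod $m$) that is not divisible by 2, the second testing the two congruences $a \equiv b \pmod{2}$ and $Na \equiv Nb \pmod{8l}$ on a representative of $b$.

```python
from itertools import product

def qmul(x, y):
    """Hamilton product of two quaternions stored as 4-tuples."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (x0*y0 - x1*y1 - x2*y2 - x3*y3,
            x0*y1 + x1*y0 + x2*y3 - x3*y2,
            x0*y2 - x1*y3 + x2*y0 + x3*y1,
            x0*y3 + x1*y2 - x2*y1 + x3*y0)

def odd_image(a, m):
    """Residues conj(t) a t (mod m), over all t mod m, not divisible by 2."""
    image = set()
    for t in product(range(m), repeat=4):
        tbar = (t[0], -t[1], -t[2], -t[3])
        c = qmul(qmul(tbar, (0, *a)), t)
        b = tuple(v % m for v in c[1:])
        if any(v % 2 for v in b):
            image.add(b)
    return image

def satisfies_53(a, b, l=1):
    """a = b (mod 2) and Na = Nb (mod 8l), for integer vectors a, b."""
    na = sum(v * v for v in a)
    nb = sum(v * v for v in b)
    return (all((x - y) % 2 == 0 for x, y in zip(a, b))
            and (na - nb) % (8 * l) == 0)
```

For $a = i_1$, `odd_image((1, 0, 0), 4)` can be compared with the set of residues `b` (mod 4), not divisible by 2, for which `satisfies_53((1, 0, 0), b)` is true.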

The condition, similar to (4), obtained on taking norms (AJ § 3) is 
(55) Na( Nt)? = Nb (mod 12"**). 
This implies that in all solutions of (51),° 
(56) Nt=s or Nt =—s (mod 2"), 
where s denotes a solution of 
(57) s*Na = Nb (mod 12"*), 


But, unlike the case of an odd modulus, all solutions t satisfy only one of 
conditions (56). (AJ, Th. 3.) 
Assuming (53), we define $\epsilon(a,b) = +1$ or $-1$ according as

(58) $Nt \equiv 1$ or $Nt \equiv -1 \pmod 4$

for the solutions of (52). The value of $\epsilon(a,b)$ is given in a practicable form in AJ, Theorem 5. See also § 16.


% A correction should be made in AJ, Theorem 6, p. 901. Formula (18) should be 
replaced by our present (56)-(57); and on p. 902, line 13, #¢ = +1 should be changed 
to f= +s. Further corrections: In (42), change the second + to —. On p. 908, 
line 8 from bottom, fau should be tau. On p. 913, line 6, change Nv to Nt. A desirable
amplification of AJ, § 15 is included in our present § 21. 




THE QUATERNION CONGRUENCE tat =b (mod g). 501 


12a. We omit the verification that the following more complicated, but possibly more significant, expressions for $\epsilon(a,b)$ agree with AJ Theorem 5; but note that the verification for $Na$ even is much simplified by use of the identity

$M + \tfrac12 K = \tfrac14\{(a_1 + b_1)^2 + (a_2 + b_2)^2\} + \tfrac14 (a_3 + b_3)^2$,

where

$M = \tfrac14 (Na + Nb)$, $K = a \circ b = a_1b_1 + a_2b_2 + a_3b_3$.


If K=2 aap 4), denote by & the residue (mod 16) of 4K between —8 
and +8, by (—)«x the sign of k. Then Theorem 5 of AJ implies the 
following : 
if Na==8 (mod 8), e(a,b)—=(—1) 
if Na=1 (mod 4), « =(— 1) %*-N)/8(2| K) (2| Na) ; 
if Na = 2(8), if 4|K, otherwise « =(— 1) 


where 8—=(—1)%-/8 if M=1, 8=(—1)"*(2|k) if M=5(8); 


if Na==6(8), «=(—1) if 4] K, otherwise « =(—);8, 


where if M=3, 8 if M=7(8). 
Note that if $Na \equiv 2 \pmod 4$, $K$ is even but cannot be $\equiv Na + 2 \pmod 8$.
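The closing remark can be checked by exhaustion over residues. The following is only a sketch under stated assumptions: condition (53) is taken with $l = 2$, that is $a \equiv b \pmod 2$ and $Na \equiv Nb \pmod{16}$, and coordinates are run over one full set of residues (mod 8), which suffices because a square (mod 16) depends only on its root (mod 8). The function name `violations` is ours, not the authors'.

```python
from itertools import product

def norm(v):
    """Sum of squares of the three coordinates of a pure quaternion."""
    return sum(x * x for x in v)

def violations(limit=8):
    """Collect pairs (a, b) contradicting: K is even and K != Na + 2 (mod 8)."""
    bad = []
    coords = range(limit)
    for a in product(coords, repeat=3):
        if norm(a) % 4 != 2:              # keep only Na = 2 (mod 4)
            continue
        for b in product(coords, repeat=3):
            if any((x - y) % 2 for x, y in zip(a, b)):
                continue                  # need a = b (mod 2)
            if (norm(a) - norm(b)) % 16:
                continue                  # assumed form of (53) with l = 2
            k = sum(x * y for x, y in zip(a, b))   # K = a o b
            if k % 2 or (k - norm(a) - 2) % 8 == 0:
                bad.append((a, b, k))
    return bad

counterexamples = violations()
```

An empty `counterexamples` list means no residue pair contradicts the remark under these assumptions.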


13. The determination of the number of solutions of (51) is comparatively easy, for $a$ and $b$ restricted as above. Indeed, $Nt$ being necessarily odd,

(59) (51) has the same number of solutions $t$ for every $b$ for which it is solvable.

For if $u$ is one solution, the number of solutions is the same as that of $\bar t a t \equiv \bar u a u$, hence of $\bar w a w \equiv a \pmod{l2^n}$, where $w = (t\bar u)/Nu$ is, with $t$, an independent integral quaternion variable (mod $l2^n$).

Hence by merely counting the number of residues $b$ (mod $l2^n$) satisfying (53) (or if $n = 1$, $l = 2$, (54)), and dividing into the total number of residues $t$ of odd norm, we obtain the number of solutions $t$ for any solvable $b$.

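We may restrict $t$ to modulus $l2^{n-1}$ (if $n > 1$). For, if $t$ is determined to modulus $2^r$ $(r > 0)$, $\bar t a t$ is determined to modulus $2^{r+1}$:

$(\bar t + 2^r \bar v)\,a\,(t + 2^r v) = \bar t a t + 2^r(\bar t a v - \overline{\bar t a v}) + 2^{2r}\,\bar v a v$,

where the quaternion $\bar t a v$ minus its conjugate is an even integer.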
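The lifting step can be tried numerically. The sketch below assumes quaternions with ordinary integer coordinates stored as 4-tuples `(scalar, i, j, k)`; the helper names `qmul`, `conj`, `sandwich` and `lift_is_stable` are illustrative only.

```python
import random

def qmul(p, q):
    """Hamilton product of two quaternions given as (scalar, i, j, k)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def conj(q):
    """Quaternion conjugate."""
    return (q[0], -q[1], -q[2], -q[3])

def sandwich(t, a):
    """Compute conj(t) * a * t."""
    return qmul(qmul(conj(t), a), t)

def lift_is_stable(t, v, a, r):
    """True if replacing t by t + 2^r v leaves conj(t) a t fixed mod 2^(r+1)."""
    m = 2 ** (r + 1)
    shifted = tuple(x + 2**r * y for x, y in zip(t, v))
    return all((x - y) % m == 0
               for x, y in zip(sandwich(shifted, a), sandwich(t, a)))

random.seed(0)
results = []
for _ in range(200):
    t = tuple(random.randint(-9, 9) for _ in range(4))
    v = tuple(random.randint(-9, 9) for _ in range(4))
    a = (0,) + tuple(random.randint(-9, 9) for _ in range(3))  # pure a
    r = random.randint(1, 4)                                   # r > 0
    results.append(lift_is_stable(t, v, a, r))
all_stable = all(results)
```

The check relies on $a$ being pure and on $r > 0$, which is what makes both correction terms divisible by $2^{r+1}$.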
Proceeding in this way we find the number of solutions of (51) is 


(59) if Na==3(8), 21 if Na==1(4), 2" if Na=2 (4); 




$t$ being taken (mod $l2^{n-1}$) and $n \geq 2$, (53) assumed. If $l2^n = 2$ and $a \equiv b \pmod 2$, the number of $t$'s (mod 2) is 8; if (54) holds, $n = 1$, and $l = 2$, the number of $t$'s (mod 2) is 2.


14. Extension to general odd modulus. If $m_1$ and $m_2$ are odd coprime integers and $t'$, $t''$ are solutions of $\bar t a t \equiv b$ (mod $m_1$) and (mod $m_2$) respectively, there is a unique solution $t$ (mod $m_1m_2$) of the congruences $t \equiv t'$ (mod $m_1$) and $t \equiv t''$ (mod $m_2$), and this is a solution of $\bar t a t \equiv b$ (mod $m_1m_2$). It is easily seen that there is a one-to-one correspondence between the pairs $(t', t'')$ and the quaternions $t$; also between the pairs of norm-classes $n_1$ (say) (mod $m_1$), $n_2$ (mod $m_2$), and the norm-classes $n$ (mod $m_1m_2$). If there are $\nu_1$ solutions $t'$ (mod $m_1$) in the norm-class $n_1$ and $\nu_2$ solutions $t''$ (mod $m_2$) in the norm-class $n_2$, there are precisely $\nu_1\nu_2$ solutions $t$ (mod $m_1m_2$) in the norm-class $n$ corresponding to $(n_1, n_2)$. An easy application of these ideas gives:


THEOREM 7. Let $g = p_1^{r_1} p_2^{r_2} \cdots p_s^{r_s}$ where the $p_i$ are all different odd primes. Let $a$ and $b$ be pure integral quaternions divisible by none of the $p_i$. Let $\nu_i$ $(i = 1, 2, \dots, s)$ be the number of solutions of $\bar t a t \equiv b \pmod{p_i}$ in each norm-class consistent with $(Nt)^2 Na \equiv Nb \pmod{p_i}$. Then the congruence (1) has precisely

$\prod_{i=1}^{s} \nu_i\, p_i^{\,r_i - 1}$

solutions $t$ (mod $g$) in each norm-class (mod $g$) consistent with $(Nt)^2 Na \equiv Nb \pmod g$.


The $\nu_i$ can be evaluated by the criteria given above and the number of solutions of (1) readily determined.
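The multiplicative behaviour of the solution counts under coprime moduli can be observed by brute force on a small case. This is a sketch only: it assumes integer-coordinate quaternions, takes the norm-class of $t$ to be the residue of $Nt$, and uses the illustrative choice $a = b = i + 2j + 2k$ with moduli 3, 5 and 15.

```python
from collections import Counter
from itertools import product

def qmul(p, q):
    """Hamilton product of two quaternions given as (scalar, i, j, k)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def conj(q):
    """Quaternion conjugate."""
    return (q[0], -q[1], -q[2], -q[3])

def solutions_by_norm_class(a, b, m):
    """Tally residues t (mod m) with conj(t) a t = b (mod m), keyed by Nt mod m."""
    tally = Counter()
    for t in product(range(m), repeat=4):
        c = qmul(qmul(conj(t), a), t)
        if all((x - y) % m == 0 for x, y in zip(c, b)):
            tally[sum(x * x for x in t) % m] += 1
    return tally

a = (0, 1, 2, 2)   # pure quaternion i + 2j + 2k (illustrative choice)
b = a              # t = 1 is then a solution for every modulus
by3 = solutions_by_norm_class(a, b, 3)
by5 = solutions_by_norm_class(a, b, 5)
by15 = solutions_by_norm_class(a, b, 15)

# Each norm-class n (mod 15) corresponds to the pair (n mod 3, n mod 5).
crt_holds = all(by15[n] == by3[n % 3] * by5[n % 5] for n in range(15))
```

The comparison in `crt_holds` is the statement that solution counts in corresponding norm-classes multiply.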


15. Extension to non-pure quaternions. Preserving the notation heretofore used, the problem of solving

(60) $\bar t(a_0 + a)t \equiv b_0 + b \pmod g$

is equivalent to that of solving

(60′) $\bar t a t \equiv b$, $a_0 Nt \equiv b_0 \pmod g$;

and hence to finding the solutions of (60′₁) in norm-classes consistent with (60′₂). In particular, if $g$ is odd and (60′₁) is solvable, then (60) is solvable if and only if

$r^2 Na \equiv Nb$, $r a_0 \equiv b_0 \pmod g$

have a common solution $r$. A necessary condition is that

$a_0^2\,Nb \equiv b_0^2\,Na \pmod g$.

Using Theorem 5 and considering the cases $p \mid Nb$ and $p \nmid Nb$, we have

THEOREM 8. If $g$ is an odd prime $p$; $p \nmid Na$, $p \nmid a_0$; then (60) is solvable for $t$ if and only if

$a_0^2\,Nb \equiv b_0^2\,Na \pmod p$.
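The reduction of (60) to a pure part and a scalar part, and the necessary condition that follows from it, can be illustrated with integers before any modulus is imposed. The sketch assumes integer-coordinate quaternions as 4-tuples; `check` is a hypothetical helper name.

```python
import random

def qmul(p, q):
    """Hamilton product of two quaternions given as (scalar, i, j, k)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def conj(q):
    """Quaternion conjugate."""
    return (q[0], -q[1], -q[2], -q[3])

def norm(q):
    """Norm: sum of the squares of all coordinates."""
    return sum(x * x for x in q)

def check(q, t):
    """Verify the scalar/pure split of conj(t) q t and the necessary condition."""
    a0, a = q[0], (0,) + q[1:]
    c = qmul(qmul(conj(t), q), t)        # b0 + b
    pure = qmul(qmul(conj(t), a), t)     # conj(t) a t
    split_ok = c[0] == a0 * norm(t) and c[1:] == pure[1:] and pure[0] == 0
    b0, nb = c[0], norm((0,) + c[1:])
    necessary_ok = a0 * a0 * nb == b0 * b0 * norm(a)
    return split_ok and necessary_ok

random.seed(1)
all_ok = all(
    check(tuple(random.randint(-7, 7) for _ in range(4)),
          tuple(random.randint(-7, 7) for _ in range(4)))
    for _ in range(300)
)
```

Since the identities hold over the integers, they hold for every modulus $g$.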


16. Properties of [a,b], and «(a,b). The symbol [a,b], defined in 
Theorem 1, evidently has the following properties, for any non-zero, pure 


quaternions $a$, $b$, $c$ of norm zero (mod $p$):


(61) $[a,a]_p = 1$, $[a,b]_p = [b,a]_p$, $[a,b]_p\,[b,c]_p = [a,c]_p$,
$[a,\lambda b]_p = (\lambda|p)\cdot[a,b]_p$ for any integer $\lambda$ prime to $p$.


When $t = i_1$, $\bar t a t = i_1a_1 - i_2a_2 - i_3a_3$; thus the signs of two coordinates of $a$ may be changed without changing its class $K$ or $L$ (cf. sequel to Th. 2). Hence if $a'$ is obtained from $a$ by changing the sign of one coordinate,

$[a',b]_p = [-a,b]_p = (-1|p)\,[a,b]_p$.

When $t = 1 + i_1$, $\bar t a t = 2a^{\dagger}$, where $a^{\dagger} = i_1a_1 + i_2a_3 - i_3a_2$; set $a'' = i_1a_1 + i_2a_3 + i_3a_2$; then

$[a'',b]_p = (2|p)\,[2a'',b]_p = (-2|p)\,[2a^{\dagger},b]_p = (-2|p)\,[a,b]_p$.


These two properties are summarized in Theorem 3. 
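The two conjugations used here are easy to confirm by direct multiplication. The sketch below assumes quaternions stored as `(scalar, i, j, k)` with $i_1, i_2, i_3$ in the last three slots; the sample vector is arbitrary.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (scalar, i, j, k)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def conj(q):
    """Quaternion conjugate."""
    return (q[0], -q[1], -q[2], -q[3])

def sandwich(t, a):
    """Compute conj(t) * a * t."""
    return qmul(qmul(conj(t), a), t)

a = (0, 3, 5, 7)                      # a = 3 i1 + 5 i2 + 7 i3 (arbitrary sample)
flipped = sandwich((0, 1, 0, 0), a)   # t = i1: two coordinates change sign
rotated = sandwich((1, 1, 0, 0), a)   # t = 1 + i1: twice a swap with one sign change
```

Here `flipped` is $(a_1, -a_2, -a_3)$ and `rotated` is $2(a_1, a_3, -a_2)$, matching the two cases in the text.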
The symbol $\epsilon(a,b)$ is defined only for pure $a$, $b$, neither divisible by 2, and satisfying (53). From Lemma 1 of AJ (p. 904-5) we immediately draw the following conclusions:

(62) $\epsilon(a,a) = 1$, $\epsilon(a,b) = \epsilon(b,a)$, $\epsilon(a,b)\,\epsilon(b,c) = \epsilon(a,c)$;

(63) $\epsilon(a',b) = (-1|H)\cdot\epsilon(a,b)$, $\epsilon(a'',b) = (-2|H)\cdot\epsilon(a,b)$;

(64) if $l = 2$, $\epsilon(a,5b) = -\epsilon(a,b)$.

Here $a'$ is obtained from $a$ by changing the sign of one coordinate, $a''$ is obtained from $a$ by interchanging two coordinates of like parity, $Na = lH$, $l = 1$ or 2. Since $a$ and $b$ are only determined for this symbol to modulus $4l$, these formulae provide complete data on the effects of interchanging coordinates, changing their signs, or multiplying by an odd integer. We easily


verify that if m is prime to 2h, 


(64’) mb) = (—- - 844-1) (J | m) (a,b) = (h|m) (m|H)e(a, b). 


17. Two proper representations of an integer $h$ as a sum of three squares have the following property:

THEOREM 9. Let $a$ and $b$ be proper, pure quaternions of norm $h$, $a \equiv b \pmod 2$; set $h = l\,p_1 \cdots p_s$, where $l = 1$ or 2 and the $p_i$ are odd primes. Then

(65) $\prod_{i=1}^{s} [a,b]_{p_i} = \epsilon(a,b)$.

In the proof we use the following lemma from an article by Pall, "On the Arithmetic of Quaternions," Lemma 17:

LEMMA 6. For $a$ and $b$ satisfying the hypotheses of Theorem 9, there exists a proper quaternion $t$ of norm $m$ prime to $2h$, such that

(66) $\bar t a t = m b'$,

where $b'$ is obtained from $b$ by permutations and sign-changes of $b_1$, $b_2$, $b_3$.


Since $\bar t a t \equiv (Nt)\,a \pmod 2$ (AJ (4)), (66) implies that $a \equiv b' \pmod 2$. From (63) and Theorem 3 follows that if (65) holds with $b'$ in place of $b$ it is true as stated. For a sign-change multiplies the left side of (65) by

$(-1|p_1) \cdots (-1|p_s) = (-1|H)$,

and the right side by the same factor; similarly for an interchange of two coordinates, it being noted that $b \equiv b' \pmod 2$.

From $\bar t a t \equiv m b' \pmod H$, and hence from (66), follows $[a, mb']_{p_i} = 1$,

(67) $[a,b']_{p_i} = (m|p_i)$, $\prod_{i} [a,b']_{p_i} = (m|H)$.

Similarly $\epsilon(a, mb') = 1$ or $-1$ according as $Nt = m \equiv 1$ or $-1 \pmod 4$,

$\epsilon(a, mb') = (-1|m)$, $\epsilon(a,b') = (-h|m)\,(m|H)$,


the last step by (64’). The theorem will follow when we prove that 
(—h|m) =1. This is evident from identities (25). For c=mb’ and 
Nt =m, whence + ht;?> =0 (mod m) (f =0,1,2,3). Since some fy is 
prime to any prime factor of m, (—h|m) = 1. 


18. Extension of AJ Theorem 7. 


THEOREM 10. Let $a$ and $b$ be proper, pure quaternions of the same norm $h = lH$, $l = 1$ or 2; $m$ an integer prime to $2h$. Then if the g.c.d. of the coordinates of $\bar t a t$ is prime to $h$, and $Nt$ and $Nu$ are odd,

(68) $\bar t a t \equiv m\,\bar u b u \pmod{4h}$ implies $Nt \equiv (h|m)\,Nu \pmod 4$.

For, as in (67), since $[a, \bar t a t]_p = 1 = [b, \bar u b u]_p$ for each prime factor $p$ of $H$, and by (65), $\epsilon(a,b) = (m|H)$. Hence by definition of $\epsilon$ and (64′),

$Nt \equiv \epsilon(a, mb)\,Nu \equiv (h|m)\,Nu \pmod 4$.

The case $m = 1$ gives at once


THEOREM 11. Let $h$ be a positive integer not of the form $4n$ or $8n + 7$. Denote by $A$ the class of all residues $c$ (mod $4h$), such that $c_1$, $c_2$, $c_3$, $h$ are coprime, obtained from $c \equiv \bar t a t \pmod{4h}$ as $a$ ranges over all proper, pure quaternions of norm $h$, and $t$ over all integral quaternions of norm $\equiv 1 \pmod 4$; by $B$ the class obtained similarly with $Nt \equiv 3 \pmod 4$. The classes $A$ and $B$ are mutually exclusive.


19. THEOREM 12. Let $h = l\,p_1^{r_1} \cdots p_s^{r_s} \not\equiv 7 \pmod 8$, $l = 1$ or 2, the $p_i$ distinct odd primes. Let $a$ be any proper pure quaternion of norm $h$. For any of the $2^s$ possible combinations of signs $\epsilon_1, \dots, \epsilon_s$ (each $= \pm 1$), we can choose a proper $b$ of norm $h$ such that

(69) $[a,b]_{p_i} = \epsilon_i$ $(i = 1, \dots, s)$.

That is, there are pure quaternions of norm $h$ in every possible combination of classes $K$, $L$ for the various primes dividing $H = h/l$.


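Proof. We seek a prime $p$ satisfying

(70) $(-h|p) = 1$, $(p|p_i) = \epsilon_i$ $(i = 1, \dots, s)$.

Now $(-h|p) = (-1|p)\,(h|p) = (-1|p)\,(l|p)\,(p|H)\,(-1)^{(p-1)(H-1)/4}$, and $(p|H) = \epsilon_1^{r_1} \cdots \epsilon_s^{r_s}$ by the latter part of (70). Hence (70₁) can be replaced by a condition on $p$ (mod 4 or 8), and the system (70) satisfied unless $l = 1$ and $H \equiv 3 \pmod 8$, in which case (70) can be satisfied if and only if $\epsilon_1^{r_1} \cdots \epsilon_s^{r_s} = 1$.

In this exceptional case there are an odd number of powers $p_i^{r_i}$ with $p_i \equiv 3 \pmod 4$ and $r_i$ odd. Suppose that $2^{s-1}$ $b$'s have been chosen, satisfying (69) for every combination of $\epsilon_i$ such that $\epsilon_1^{r_1} \cdots \epsilon_s^{r_s} = 1$. Then changing the sign of $b_1$ changes each of the $\epsilon_i$ corresponding to $p_i \equiv 3 \pmod 4$ to its negative, and leaves the remaining $\epsilon_i$ unchanged (Theorem 3). Clearly we get $2^{s-1}$ $b$'s such that $\epsilon_1^{r_1} \cdots \epsilon_s^{r_s} = -1$, no two having the same set of values $\epsilon_1, \dots, \epsilon_s$.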


Having chosen $p$ to satisfy (70) we can, by (70₁), choose $a_0$ so that $a_0^2 + h \equiv 0 \pmod p$; hence the quaternion $a_0 + a$ has a right-divisor of norm $p$ (AJ, p. 896). Say, $a_0 + a = u\bar t$, $Nt = p$. Then $\bar t(a_0 + a)t = \bar t u \cdot p$, $\bar t a t = pb$, where $b = \bar t u - a_0$. Thus $[a,b]_{p_i} = (p|p_i) = \epsilon_i$.


20. The classes $A$, $B$ of Theorem 11 are exhaustive. More precisely,

THEOREM 13. Let $h = lH$ ($l = 1$ or 2) be a positive integer not of the form $4n$ or $8n + 7$. As $a$ ranges over all proper, pure quaternions of norm $h$, and $t$ over all integral quaternions of odd norms, then $\bar t a t$ represents (mod $4h$) all pure quaternion residues $c$ such that

(71) $Nc \equiv h \pmod{8h}$.

For by (71), $Nc \equiv 0 \pmod H$. By Theorem 7 (or 6),

(72) $\bar t a t \equiv c \pmod H$

is solvable for $t$ if and only if $[a,c]_p = 1$ for each prime $p$ dividing $H$ (which can be satisfied, in view of Theorem 12, by choice of $a$) and the norm-condition $s^2 Na \equiv Nc \pmod H$ can be solved. Since $H \mid Na$ and $Nc$, $s$ is arbitrary.

Further, by (71), $Na \equiv Nc \pmod{8l}$ for any $a$ of norm $h$, and hence by an even number of permutations of the coordinates of $a$ (thereby not affecting the solvability of (72)), we can secure $a \equiv c \pmod 2$. By AJ, Theorem 4,

(73) $\bar t a t \equiv c \pmod{4l}$

is solvable. The moduli being coprime both (72) and (73) can be satisfied. From Corollary 3 follows immediately the

COROLLARY. Theorem 13 holds with $Nt$ confined to any residue class (mod $H$).


21. Extension of AJ Theorem 1. 


THEOREM 14. If $8n + 1$ is not a square, and $h$ is a positive integer not of the form $4k$ or $8k + 7$, the equation (1′) has equally many proper solutions in each of the classes $A$ and $B$ defined in Theorem 11. If $8n + 1 = m^2$ $(m > 0)$, then all proper solutions are in $A$ or $B$ according as $m \equiv 1$ or $3 \pmod 4$.

By AJ, p. 896, any proper pure quaternion of norm $hm^2$ is of the form $\bar t a t$, with $Na = h$ and $Nt = m$. Hence the second part follows from Theorem 11.

Assume $8n + 1$ not square. We shall set up a (4-4) correspondence between the solutions of types $A$ and $B$. For this we choose a prime $p$ such that

(74) $(-h|p) = -1$, $(-h(8n+1)|p) = 1$.

This is equivalent to the following:

$(-1|p)\,(l|p)\,(p|H)\,(-1)^{(p-1)(H-1)/4} = -1$, $(p|8n+1) = -1$,

and can evidently be satisfied except possibly when $l = 1$ and $H \equiv 3 \pmod 8$; it then reduces to $(p|H) = -1$, $(p|8n+1) = -1$; these are still evidently compatible unless $H$ and $8n + 1$ contain exactly the same prime factors, say

$H = p_1^{e_1} \cdots p_s^{e_s}$, $8n + 1 = p_1^{f_1} \cdots p_s^{f_s}$.

At least one $e_i$ is odd since $H \equiv 3 \pmod 8$, and at least one $f_i$ since $8n + 1$ is not a square. If $e_i$ and $f_i$ are odd for the same $i$, we choose $(p|p_i) = -1$, and $(p|p_k) = 1$ for $k \neq i$; otherwise, for some $i$, $j$, $e_i$ and $f_j$ are odd, $e_j$ and $f_i$ even; and $(p|p_i) = (p|p_j) = -1$, the remainder $+1$, is effective.

Choose a solution $a_0$ of $a_0^2 + h(8n+1) \equiv 0 \pmod p$. We employ the following process with $p$ and $a_0$ if $x$ is a proper solution of (1′) of class $A$, with $p$ and $-a_0$ if $x$ is of class $B$. Since $p \mid N(a_0 + x)$,

(75) $a_0 + x = uv$, $Nv = p$,

$v$ being unique up to a left unit factor. Hence $v x \bar v = p x'$, where $x' = vu - a_0$. Taking conjugates gives

(76) $-a_0 + x' = -\bar u \bar v$, where $N\bar v = p$.

Hence if the process be applied to $x'$, with $p$ and $-a_0$, the right-divisors obtained are the left-associates of $\bar v$, and since $\bar v x' v = p x$, we are led back to $x$. If $i_\alpha v$ is employed in place of $v$ in (75), $x'$ becomes $-i_\alpha x' i_\alpha$, which differs from $x'$ in the signs of two of $x'_1$, $x'_2$, $x'_3$ only. Thus the four quaternions

(77) obtained from $x$ by changing the signs of an even number of its coordinates

are associated with four similarly related quaternions $x'$. The (4-4) correspondence will be established when we show that if $x$ is of class $A$, $x'$ is of class $B$, and conversely; for the method of choosing between $a_0$ and $-a_0$ assures that no $x$ not already placed in the correspondence will lead to one already placed. By Theorem 13 we can set $x \equiv \bar t a t$, $x' \equiv \bar w a w \pmod{4h}$; thus $v \bar t a t \bar v \equiv p\,\bar w a w \pmod{4h}$; by Theorem 10, since $Nv \equiv (-1|p) \pmod 4$, $Nt \equiv (-h|p)\,Nw \equiv -Nw \pmod 4$, the last step by (74₁).


22. THEOREM 15. For $a$ and $h$ as in Theorem 12 there are equally many proper, pure quaternions $b$ of norm $h$ having any of the $2^s$ sets of values

$[a,b]_{p_i} = \epsilon_i$ $(i = 1, \dots, s)$.

In the proof of Theorem 12 we chose primes $p = p(\epsilon_1, \dots, \epsilon_s)$ and with each $p$ an $a_0 = a_0(p)$. We use $a_0$ when applying the process to a quaternion having every $\epsilon_i = 1$, otherwise $-a_0$. Then much as in the preceding section, with every quadruplet (77) having every $\epsilon_i = 1$ is associated $2^s$ or $2^{s-1}$ quadruplets $b$ with the respective values $\epsilon_1, \dots, \epsilon_s$; the $2^{s-1}$ applying only if $h \equiv 3 \pmod 8$, when $\epsilon_1^{r_1} \cdots \epsilon_s^{r_s} = 1$. If the process, with the corresponding $p$ and $-a_0$, be applied to any of these $b$'s we are led back to the original four $x$'s. If every $\epsilon_i = 1$, we take $p = 1$, so that (77) then leads to itself.

If a $b$ distinct from those already considered exists, there is one with every $\epsilon_i = 1$. For applying to such a $b$ the corresponding $p(\epsilon_1, \dots, \epsilon_s)$ and $-a_0(p)$, we obtain one with every $\epsilon_i = 1$ and distinct from those already considered, since $b$ is. The theorem follows, but only for $\epsilon_1^{r_1} \cdots \epsilon_s^{r_s} = 1$ if $h \equiv 3 \pmod 8$. Then, changing the sign of $b_1$ evidently produces an equal number of $b$'s for each $\epsilon_1, \dots, \epsilon_s$ such that $\epsilon_1^{r_1} \cdots \epsilon_s^{r_s} = -1$.

Let $R_3(h)$ denote the number of proper representations of $h$ as a sum of three squares. In view of Theorem 3 we have the

COROLLARY. Let $s$ denote the number of distinct odd prime factors of $h$, $h > 3$, $h \neq 4n$, $h \neq 8n + 7$. Then $4 \cdot 2^s \mid R_3(h)$, and $R_3(h) \geq 12 \cdot 2^s$.

For in case of no equalities or zeros, the 48 proper representations obtained from $(x_1, x_2, x_3)$ by permutations and sign-changes have at most four values $(\epsilon_1, \dots, \epsilon_s)$; a proper representation $(x_1, x_2, 0)$ or $(x_1, x_1, x_3)$ can occur only if $(-1|p_i) = 1$ (or $(-2|p_i) = 1$) for every $p_i$, and then there are at most two values $(\epsilon_1, \dots, \epsilon_s)$.
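The Corollary lends itself to a numerical spot check. The sketch below assumes that a proper representation means an ordered triple of integers, signs included, with greatest common divisor 1; the function names and the search bound 150 are illustrative choices.

```python
from math import gcd, isqrt

def proper_representations(h):
    """Count ordered integer triples (x, y, z), gcd 1, with x^2 + y^2 + z^2 = h."""
    count = 0
    r = isqrt(h)
    for x in range(-r, r + 1):
        for y in range(-r, r + 1):
            z2 = h - x * x - y * y
            if z2 < 0:
                continue
            z = isqrt(z2)
            if z * z != z2:
                continue
            for zz in {z, -z}:            # the set avoids counting z = 0 twice
                if gcd(gcd(x, y), zz) == 1:
                    count += 1
    return count

def odd_prime_factor_count(h):
    """Number s of distinct odd primes dividing h."""
    n = h
    while n % 2 == 0:
        n //= 2
    s, p = 0, 3
    while p * p <= n:
        if n % p == 0:
            s += 1
            while n % p == 0:
                n //= p
        p += 2
    return s + (1 if n > 1 else 0)

def corollary_holds(h):
    """Check 4*2^s | R3(h) and R3(h) >= 12*2^s for one admissible h."""
    s = odd_prime_factor_count(h)
    r3 = proper_representations(h)
    return r3 % (4 * 2**s) == 0 and r3 >= 12 * 2**s

admissible = [h for h in range(4, 150) if h % 4 != 0 and h % 8 != 7]
all_hold = all(corollary_holds(h) for h in admissible)
```

For instance $h = 21$ has the single shape $1 + 4 + 16$, giving 48 ordered signed triples with $s = 2$, so the bound $12 \cdot 2^s$ is attained there.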


WESTON COLLEGE 
AND 
McGILL UNIVERSITY.




QUASI-GROUPS WHICH SATISFY CERTAIN GENERALIZED
ASSOCIATIVE LAWS.*


By D. C. MURDOCH.¹


Introduction. In a recent paper² on the theory of quasi-groups, B. A. Hausmann and O. Ore have given an interesting analysis of the associative law, and have shown that it may be generalized to a very considerable extent without invalidating many of the principal theorems of group theory. In the present paper we shall consider an interesting class of quasi-groups which are not for the most part included under the laws discussed by Hausmann and Ore, namely, those which satisfy an associative law of the form

$a(bc) = (ab)c_a$,

where $c_a$ is independent of $b$.

In the first section properties of the right units of such a quasi-group $G$ are derived. It is shown that $G$ contains a set of minimal right unit sub-quasi-groups having no elements in common. At least one of these is contained in every sub-quasi-group of $G$.

The second section deals with coset expansions. In order to obtain a suitable definition of a normal sub-quasi-group, it is necessary to assume, in addition, a second associative law, symmetrical to the first, namely,

$(ab)c = a_c(bc)$,

where $a_c$ is independent of $b$. Certain structural properties of such quasi-groups, and the Jordan-Hölder theorem, are proved in Section 3.

The fourth section is devoted to quasi-groups which satisfy the law

$(ab)(cd) = (ac)(bd)$.

This law is interesting since it not only implies both previous laws, but at the same time imposes a generalized commutative law. In fact, such quasi-groups are direct generalizations of Abelian groups. For this reason I have called them Abelian quasi-groups, in spite of the fact that they are not in general commutative.

* Received May 19, 1938; revised October 15, 1938.

¹ Sterling Fellow, Yale University.

² B. A. Hausmann and Oystein Ore, "Theory of quasi-groups," American Journal of Mathematics, vol. 59 (1937), pp. 983-1004.




In Section 5 are discussed properties of quasi-groups which satisfy Law I and have a unique right unit. These are the same as those satisfying postulate B of A. Suschkewitsch,³ and have properties very similar to those of ordinary groups. The final section is devoted to examples, and to methods of constructing various types of quasi-groups.


1. Quasi-groups which satisfy Law I. Right Units. By a quasi-group $G$ we shall understand a set of elements, closed under multiplication, in which the quotient axiom is satisfied. That is, the equations $ax = b$, $ya = b$ are uniquely soluble for $x$ and $y$ where $a$ and $b$ are any two (not necessarily distinct) elements of $G$. This implies both left and right cancellation laws. We shall be concerned only with finite quasi-groups, although many of the results obtained hold in the infinite case also. If $G$ is finite, then every subset of $G$, which is closed under multiplication, satisfies the quotient axiom, and is therefore a sub-quasi-group of $G$.

We shall assume that $G$ satisfies:

ASSOCIATIVE LAW I. If $a$, $b$, $c$ are any three elements of $G$ then

(1) $a(bc) = (ab)c_a$,

where $c_a$ is independent of $b$.

From the quotient axiom it follows that every element $a$ of $G$ has a right unit $e_a$ and a left unit $e'_a$, defined by the equations

$a e_a = e'_a a = a$.

These units are uniquely determined. Since in (1), $c_a$ is independent of $b$, we find on putting $b = e_a$ that

$a(e_a c) = a c_a$,

and therefore $c_a = e_a c$. It is convenient now to introduce the following notation: denote by $f_a(c)$ the element defined by the equation

$e_a f_a(c) = c$,

where $c$ is any element of $G$ and $a$ any fixed element. It follows that

$f_a(e_a c) = c$,

and hence the function $f_a^{-1}$ inverse to $f_a$ is defined by

³ A. Suschkewitsch, "On a generalization of the associative law," Transactions of the American Mathematical Society, vol. 31 (1929), pp. 204-14.




$f_a^{-1}(c) = e_a c$.

Our associative law (1) then becomes

(2) $a(bc) = (ab)\,f_a^{-1}(c)$, $(ab)c = a[b\,f_a(c)]$,

where the second equation is a consequence of the first. If we put $c = e_b$ in the first equation of (2), we find that

$e_{ab} = e_a e_b$,

from which follows:

THEOREM 1. The set of all right units of $G$ form a sub-quasi-group $R$, and $a \to e_a$ is a homomorphic mapping of $G$ on $R$.
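A small concrete model may help in following Law I and Theorem 1. The example below is our own and is not taken from the paper: on the residues mod 9 we assume the product $x \cdot y = 4x + 2y$, which is a quasi-group because 4 and 2 are both prime to 9.

```python
n, alpha, beta = 9, 4, 2      # assumed illustrative parameters, both units mod 9
G = range(n)

def mul(x, y):
    """Product x.y = alpha*x + beta*y (mod n)."""
    return (alpha * x + beta * y) % n

def right_unit(a):
    """The unique e with a.e = a."""
    return next(e for e in G if mul(a, e) == a)

# Quotient axiom: every row and column of the multiplication table is a permutation.
is_quasigroup = all(
    len({mul(a, x) for x in G}) == n and len({mul(x, a) for x in G}) == n
    for a in G
)

# Law I in the form a(bc) = (ab)(e_a c), i.e. c_a = e_a c independent of b.
law_I = all(
    mul(a, mul(b, c)) == mul(mul(a, b), mul(right_unit(a), c))
    for a in G for b in G for c in G
)

# Theorem 1: the right unit of ab is e_a e_b, and the right units are closed.
homomorphism = all(
    right_unit(mul(a, b)) == mul(right_unit(a), right_unit(b))
    for a in G for b in G
)
R = sorted({right_unit(a) for a in G})
R_closed = all(mul(x, y) in R for x in R for y in R)
```

In this model $e_a = 3a$, so $R = \{0, 3, 6\}$ is a proper sub-quasi-group, and repeating the map sends $R$ onto the single element 0.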


The homomorphism $a \to e_a$ maps $G$ on $R$, $R$ on its right unit quasi-group $R_1$, $R_1$ on its right unit quasi-group $R_2$, and so on. Since $G$ is finite we must finally reach a sub-quasi-group $R_i$ which is mapped on itself. Since every sub-quasi-group of $R_i$ must contain its own right units, it follows that $a \to e_a$ is an automorphism, not only of $R_i$ but also of every sub-quasi-group of $R_i$. Now let $H_1, H_2, \dots, H_s$ be the set of all minimal sub-quasi-groups of $R_i$. Since every sub-quasi-group of $G$ contains a sub-quasi-group of $R$, and therefore of $R_1, \dots, R_i$, it must contain at least one of these $H_j$. These results may be stated as follows:

THEOREM 2. Any finite quasi-group $G$ satisfying Law I contains a set of minimal right unit sub-quasi-groups, no two of which have elements in common, and at least one of which is contained in every sub-quasi-group of $G$.


If one of the minimal right unit sub-quasi-groups consists of a single element $e$, then $e$ is its own right (and left) unit and will be called a principal unit. The set of all principal units of $G$ form a sub-quasi-group, since if

$e_{e_1} = e_1$ and $e_{e_2} = e_2$, then $e_{e_1 e_2} = e_{e_1} e_{e_2} = e_1 e_2$,

and hence $e_1e_2$ is also a principal unit.

Since the minimal right unit sub-quasi-groups themselves have no sub-quasi-groups, they are necessarily cyclic quasi-groups in which every element is primitive. Moreover, every element is the right unit of some other element. They are not necessarily of prime order as might be expected, since the direct product of two distinct quasi-groups which have no sub-quasi-groups, will itself have no sub-quasi-groups.




2. Coset expansions and normal sub-quasi-groups. A necessary and sufficient condition for the existence of left coset expansions with respect to any sub-quasi-group is obtained by Hausmann and Ore and may be stated as follows:

If, for arbitrary elements $a$ and $b$ and a fixed element $c_0$,

$(ab)c_0 = ad_0$,

then for arbitrary $c$

$(ab)c = ad$,

where $d$ belongs to the quasi-group $\{c_0, d_0, c\}$.

This condition is easily seen to be satisfied in our case if we put $c_0 = e_a$. For then $d_0 = b\,f_a(e_a)$, and hence $b$, and therefore $b\,f_a(c)$, belongs to $\{e_a, d_0, c\}$. Hence we have

THEOREM 3. If $H$ is any sub-quasi-group of $G$, then $G$ may be represented by means of left cosets of $H$.


If a sub-quasi-group $H$ does not contain the right unit quasi-group $R$ of $G$, then a coset $aH$ will not always contain its defining element $a$. It is natural to ask whether all elements of the coset containing $a$ define the same coset $aH$. This is easily proved to be so if we make $G$ symmetrical by assuming, in addition to (1),

ASSOCIATIVE LAW II. If $a$, $b$, $c$ are any three elements of $G$, then

(3) $(ab)c = a_c(bc)$,

where $a_c$ is independent of $b$.

Since $a_c$ is independent of $b$, we find on putting $b = e'_c$, that

$(ae'_c)c = a_c c$,

and therefore $a_c = ae'_c$. Now if we define the function $f'_c$ by the equation

$f'_c(a)\,e'_c = a$,

for all $a$ and any fixed $c$ in $G$, we find that

$f'_c(ae'_c) = a$,

and therefore the inverse function $f_c'^{-1}$ is defined by

$f_c'^{-1}(a) = ae'_c$.




Hence our second associative law (3) becomes

(4) $(ab)c = f_c'^{-1}(a)\,(bc)$, $a(bc) = [f'_c(a)\,b]\,c$,

the second equation being a consequence of the first.

From (2) and (4) we have now

$a[b\,f_a(c)] = (ae'_c)(bc) = a[e'_c\,f_a(bc)]$,

and therefore

(5) $b\,f_a(c) = e'_c\,f_a(bc)$,

for all elements $a$, $b$, and $c$ in $G$. Now if we let $c$ run through all elements of a sub-quasi-group $H$, while $b$ is any fixed element of $H$, say $h$, the right hand side of (5) gives the same set of elements whatever element $h$ is chosen. Hence the left hand side, $h\,f_a(H)$, is the same for any element $h$ of $H$. The same is therefore true of

$(ah)H = a[h\,f_a(H)]$,

and we have proved

THEOREM 4. If $G$ satisfies Associative Laws I and II, and $H$ is any sub-quasi-group of $G$, then all elements of a coset $aH$ define a fixed coset $(ah)H$ which is equal to $aH$ if and only if $H$ contains $e_a$.


It also follows from (4), on putting $a = e'_b$, that the left units of $G$ form a sub-quasi-group $L$, and $a \to e'_a$ is a homomorphism mapping $G$ on $L$. By an argument similar to that of Theorem 2, we obtain a set of minimal left unit sub-quasi-groups $H'_1, \dots, H'_s$, one of which must be contained in every sub-quasi-group of $G$. It follows that these are identical, in some order, with the minimal right unit sub-quasi-groups, and may therefore be called, without ambiguity, minimal unit sub-quasi-groups. Law II also insures the existence of right coset expansions.


A further consequence of equation (5) is:

THEOREM 5. The product set $(aH)(bH)$ of any two cosets contains the coset $(ab)H$.

For if $c$ runs through all elements of $H$, (5) gives

(6) $b\,f_a(H) \subseteq H\,f_a(bH)$

for all $a$ and $b$ in $G$, and therefore

$(ab)H \subseteq a[H\,f_a(bH)] = (aH)(bH)$.
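Theorems 4 and 5 can be watched on a small example. The model below is an assumption of ours, not the author's: residues mod 9 with $x \cdot y = 2x + 5y$, which satisfies both Laws I and II, together with the sub-quasi-group $H = \{0, 3, 6\}$.

```python
n, alpha, beta = 9, 2, 5      # assumed illustrative parameters, both units mod 9
G = range(n)
H = (0, 3, 6)                 # closed under mul, hence a sub-quasi-group

def mul(x, y):
    """Product x.y = alpha*x + beta*y (mod n)."""
    return (alpha * x + beta * y) % n

def right_unit(a):
    """The unique e with a.e = a."""
    return next(e for e in G if mul(a, e) == a)

def coset(a):
    """Left coset aH."""
    return frozenset(mul(a, h) for h in H)

def product_set(X, Y):
    """All products xy with x in X and y in Y."""
    return frozenset(mul(x, y) for x in X for y in Y)

H_closed = all(mul(x, y) in H for x in H for y in H)

# Theorem 5: (ab)H is contained in (aH)(bH).
theorem_5 = all(
    coset(mul(a, b)) <= product_set(coset(a), coset(b))
    for a in G for b in G
)

# Theorem 4: every element of aH defines one and the same coset ...
fixed_coset = all(len({coset(x) for x in coset(a)}) == 1 for a in G)
# ... and that coset is aH exactly when H contains the right unit of a.
theorem_4 = all(
    (coset(mul(a, H[1])) == coset(a)) == (right_unit(a) in H)
    for a in G
)
```

In this model $aH = 2a + H$, so a coset contains its defining element only when $a$ lies in $H$, which is the situation the text describes for sub-quasi-groups not containing $R$.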




If equality holds in (6), that is if

$H\,f_a(bH) = b\,f_a(H)$

for all $a$ and $b$ in $G$, then $H$ will be called a left-normal sub-quasi-group of $G$. If $H$ is left normal in $G$ then

$(aH)(bH) = (ab)H$,

and the cosets of $H$ evidently form a left quotient quasi-group $G/H$ which is homomorphic to $G$.

THEOREM 6. If $H$ is left normal in $G$, the mapping of each coset $aH$ on the coset defined by any element of $aH$ is an automorphism of $G/H$.

For

$[(ah)H][(bh)H] = [(ah)(bh)]H = [(ab)h]H$.

Evidently right normality could be defined by

$f'_c(Ha)H = f'_c(H)a$,

and a right quotient group would result. Since all theorems which hold in the one case will hold in the other also, we shall confine our attention to the left hand case. Unless otherwise stated, we shall use the terms coset, normal, and quotient quasi-group to mean left coset, left normal and left quotient quasi-group respectively.

3. Structural properties and the Jordan-Hölder Theorem. It is evident that if we include a void sub-quasi-group $O$, containing no elements, and contained in every sub-quasi-group of $G$, then the sub-quasi-groups of $G$ will form a structure, or lattice. In order to prove the same for the normal sub-quasi-groups, it is necessary to confine ourselves to those normal sub-quasi-groups which contain the right unit quasi-group $R$ of $G$. We shall show that these not only form a structure, but a Dedekind structure.

Two sub-quasi-groups $H$ and $K$ of $G$ will be said to be permutable if for any two elements $h$ and $k$ we have

$hk = k'h'$,

where $k'$ is in $K$ and $h'$ in $H$.

THEOREM 7. Any two normal sub-quasi-groups $H$ and $K$ which have a non-void crosscut $D$, are permutable.

Proof. If h,k are any elements of H, K respectively, and d is any fixed 
element of D, then 


hk = (dh, ) = d[hifa(kid) | d[kifa(he) (dk,) he = 




where $k_2$ is in $K$ and $h_2$ is in $H$. Similarly from the normality of $K$, every product $kh$ may be written $h'k'$.

COROLLARY. If the crosscut $(H, K)$ is not void, and if $H$ and $K$ are normal, then the union $[H, K]$ consists of all elements, and only those, of the form $hk$.


THEOREM 8. The normal sub-quasi-groups of $G$ which contain $R$ form a Dedekind structure.

Proof. To show that they form a structure, it is sufficient to show that the union of two such sub-quasi-groups is normal, since this is obvious for the crosscut. For a sub-quasi-group $H$ containing $R$ the normality condition reduces to

(7) $H(bH) = f_a^{-1}(b)H$

for all elements $a$ and $b$ in $G$. By the above corollary we have $[H, K] = HK$ and from Theorem 7 and the normality of $H$ and $K$,


(HK) (b(HK)] = (KH)[(bH)K] =[(KH) (bH)]K 
— [(Kb)H]K = K[(bH)K] = (bH)K = fa? (b) [HK] 


for all $a$ and $b$ in $G$. Hence $[H, K]$ is normal, and the normal sub-quasi-groups containing $R$ form a structure, $\Sigma$.

To show that it is a Dedekind structure, it is necessary to show that if $M$ is any element of $\Sigma$ containing $H$, then

$(M, [H, K]) = [H, (M, K)]$.

The right hand side is certainly contained in the left, and that the left is also contained in the right follows easily from the corollary to Theorem 7.⁴

The Dedekind structure $\Sigma$ will contain a unit element $\bar R$ which will be equal to $R$ if $R$ is normal. It follows from the properties of Dedekind structures⁵ that all principal chains of normal sub-quasi-groups between $G$ and $\bar R$ have the same length, and that the quotient structures between successive terms in any one such principal chain are isomorphic in some order to those in any other.

We can also prove the law of isomorphism just as in the case of groups: 


⁴ See Hausmann and Ore, page 995, Theorem 4.
⁵ Oystein Ore, "On the foundation of abstract algebra I," Annals of Mathematics, vol. 36 (1935), no. 2.




THEOREM 9. If $H$ and $K$ are any two permutable sub-quasi-groups of $G$ which both contain $R$, and if $H$ is normal in the union $[H, K]$, then the crosscut $(H, K)$ is normal in $K$ and

$[H, K]/H \cong K/(H, K)$.


The Jordan-Hölder theorem then follows, as in groups, for series of composition between G and R. These theorems, of course, give us no information
concerning quasi-groups in which G =, or in which G=RS~R. There 
are two special cases, however, which are of considerable interest and concerning which more can be said. These are the 'Abelian' quasi-groups of the next section, and those in which $R$ consists of a single element, which will be treated in Section 5. The latter restriction is a great simplification and leads to quasi-groups which are a special case of those discussed by Suschkewitsch, and which preserve many properties of ordinary groups.


4. Abelian quasi-groups. In this section we shall consider quasi-groups which satisfy

ASSOCIATIVE LAW III. If $a$, $b$, $c$, $d$ are any four elements of $G$, then

(8) $(ab)(cd) = (ac)(bd)$.

Such a quasi-group will be called Abelian since, although not in general commutative, it is, as we shall see, a generalization of an Abelian group.

If $a$ is any element of $G$ we shall understand by a power of $a$, any element of the cyclic quasi-group generated by $a$. Such a power will be represented by the notation $\phi_r(a)$ where $r$ is the number of factors $a$ which occur. We can then prove the following theorem which generalizes the law $(ab)^n = a^n b^n$ of Abelian groups.


THEOREM 10. If $a$ and $b$ are any two elements of an Abelian quasi-group, and if $\phi_n(a)$ is any power of $a$, then

(9) $\phi_n(ab) = \phi_n(a)\,\phi_n(b)$.

Proof. From (8) the theorem obviously holds for $n = 2$. Assuming it true for all powers involving fewer than $n$ factors, since every power $\phi_n(a)$ can be written as the product of two such powers, we have, for some $r$ less than $n$,

$\phi_n(ab) = \psi_r(ab)\,\chi_{n-r}(ab) = [\psi_r(a)\,\psi_r(b)]\,[\chi_{n-r}(a)\,\chi_{n-r}(b)]$
$= [\psi_r(a)\,\chi_{n-r}(a)]\,[\psi_r(b)\,\chi_{n-r}(b)] = \phi_n(a)\,\phi_n(b)$.

The theorem therefore holds for powers with $n$ factors, and therefore holds in general.




COROLLARY. If $\phi_n(a)$, $\psi_m(a)$ are any two powers of $a$ then

$\phi_n[\psi_m(a)] = \psi_m[\phi_n(a)]$.

This follows by repeated application of the formula which results from (9) on putting $b = a$.

It is evident that in an Abelian quasi-group all sub-quasi-groups are normal. We therefore have $\bar R = R$, and Theorems 8 and 9 will give the Jordan-Hölder theorem for composition series (which are here the same as the principal series of normal sub-quasi-groups) between $G$ and $R$. Moreover, since the union of any two normal sub-quasi-groups is now necessarily normal, we can prove, just as in Theorem 8,

THEOREM 11. In an Abelian quasi-group, the set of all sub-quasi-groups which contain a given minimal unit sub-quasi-group, forms a Dedekind structure.


Hence all principal chains between $G$ and any fixed minimal unit sub-quasi-group $H_i$ must have the same length, and the quotient quasi-groups associated with any two of these chains will be isomorphic in some order. However, the set of all sub-quasi-groups (including a void one) will not in general form a Dedekind structure, and therefore principal chains between $G$ and two different minimal unit sub-quasi-groups, need not have the same length.

If $H \supseteq R$, the quotient quasi-group $G/H$ will have a unique right unit, since $(aH)H = a[H\,f_a(H)] = aH$, for all $a$ in $G$. In particular this is true of $R$ itself, and since $G/R$ satisfies the same associative law as $G$, we find that left multiplication by $R$ is an automorphism $s$ of $G/R$ and, in view of (7), we may write

$R(cR) = (cR)^s = f_a^{-1}(c)R$

for all $a$ and $c$ in $G$. Putting $a = e_a$, $d = e_c$ in (8), we find

$f_a^{-1}(b)\,c = f_a^{-1}(c)\,(be_c)$.

Now let $b$ run through all elements of $R$, and we have

$Rc = f_a^{-1}(c)R$,

and therefore,

$R(cR) = (cR)^s = Rc$.

These results may be stated as follows:


THEOREM 12. If G is Abelian, the quotient quasi-group G/R has a unique




518 D. C. MURDOCH. 


right unit R. Every left coset of R is also a right coset of R and conversely.
The mapping aR → Ra is an automorphism of G/R and is equivalent to left
multiplication by the right unit R. Finally if aR = Ra for all a in G then 
G/R is a group. 


This last statement follows since if a quasi-group satisfying any one of 
our associative laws has an absolute unit, it is a group. 


5. Quasi-groups with unique right unit. If G satisfies Law I and has 
a unique right unit e, then equations (2) become 


(11) $a(bc) = (ab)c^{s}$,


$(ab)c = a(bc^{s^{-1}})$,


where $c^{s} = ec$ and $ec^{s^{-1}} = c$. Putting a = e in the first equation of (11),
we find
$(bc)^{s} = b^{s}c^{s}$,


and therefore s is an automorphism of G.
Right and left inverses of any element a are uniquely defined by 


aa* = 4.44 =e, 
and the solutions of the linear equations za = b, ay = 6 are then found to be 
(13) y= (a.b)*. 
Hence, putting b =e 
(14) = 


Evidently 
(a')_, = =a, 
and therefore 


THEOREM 13. The set H, of all elements which commute with e, is a
group, the largest group contained in G. 


This follows from (12), (14), and (11), since $a^{s} = a$ for all elements
a of H.

Since e is contained in every sub-quasi-group H of G, every left coset
aH contains its defining element a. In this case, therefore, it is not necessary




to assume Law II in order to prove Theorems 4 and 5. Both these Theorems 
are obviously true whenever H contains the right unit quasi-group R. The 
definition of normality also becomes greatly simplified. A normal sub-quasi- 
group H, may be defined as one for which 


$a^{s}H = Ha$
for every element a of G. Then if H is normal, we have
$H(aH) = a^{s}H = Ha$


and therefore aH → Ha is an automorphism of G/H, and is equivalent to left
multiplication by the right unit. If aH = Ha for all a in G, then G/H is a 
group. This is the case only if H contains the left units of all elements of G. 
Since k = R =e, the normal sub-quasi-groups form a Dedekind structure 
and the law of isomorphism and the Jordan-Hölder theorem hold exactly as
in groups. 


If G is Abelian, putting a = d = e in (8), we find that
(15) $b^{s}c = c^{s}b$


for all b and c in G. Hence when G has a unique right unit, (8) implies
both (11) and (15). Conversely (11) and (15) imply (8), for


$(ab)(cd) = a[(b^{s}c)d^{s}]^{s^{-1}} = a[(c^{s}b)d^{s}]^{s^{-1}} = (ac)(bd)$.


Therefore, in the case of quasi-groups with unique right unit satisfying Law I, 
the Abelian quasi-groups may be characterized by (15). 

Let G be any Abelian quasi-group with unique right unit e, and let $\varphi(a)$
denote any power of a. It follows from Theorem 10, that to any such power $\varphi$
there correspond two sub-quasi-groups of G. The first, which we shall denote
by $G_\varphi$, consists of all elements x of G which satisfy the relation

$\varphi(x) = e$,

and the second, $G^{(\varphi)}$, consists of all elements of the form $\varphi(x)$, where x runs
through all elements of G. We can then prove


THEOREM 14. The quotient quasi-group $G/G_\varphi$ is isomorphic to $G^{(\varphi)}$.


Proof. $\varphi(a) = \varphi(b)$ if and only if b lies in $aG_\varphi$. For if b = ag, where
g is an element of $G_\varphi$, then $\varphi(b) = \varphi(ag) = \varphi(a)\varphi(g) = \varphi(a)$. Conversely
if $\varphi(a) = \varphi(b)$, then




and therefore a,b belongs to Gg», and by (13) and subsequent equations, 
b belongs to $aG_\varphi$. Hence the correspondence


$aG_\varphi \to \varphi(a)$


is evidently an isomorphism between $G/G_\varphi$ and $G^{(\varphi)}$.
The transform a, of a by b, may be defined by the equation 


(16) ay = (ba*")b-, 


and the laws of transformation are easily deduced. Moreover the set C of all 
elements which are invariant under transformation by all elements of G, 
forms an Abelian sub-quasi-group, the centre of G. 

Quasi-groups of this type are a special case of those discussed by 
Suschkewitsch, who showed that they always contain a second operation ×,
under which they form a group. In our case this operation is defined by


$b \times c = bc^{s^{-1}}$
and is obviously associative. Equation (16) then becomes


which is the ordinary transform of a by 6 in the group G(X), since 6°" is 
the inverse of 6 with respect to the operation xX. 

Although every sub-quasi-group H of G gives rise to a sub-group H(X) 
of G(X), the converse® is not necessarily true. For example, the normalizer 
of any set of elements H of G can be defined as the set N of all elements a of 
G such that 

(aH*")a" = H. 


If a and b are in N, it follows from the laws of transformation that ab* 
is also in N and therefore N(X) is a sub-group of G(X), and the order of N
divides that of G, but we cannot say that N is necessarily a sub-quasi-group 
of G. It can be shown, however, just as in groups, that the number of distinct 
sets conjugate to H is g/n, where g is the order of G and n the order of N. 


6. Examples. The quasi-groups with unique right unit, which satisfy 
(11), are identical with those satisfying Postulate B of A. Suschkewitsch. 


* Suschkewitsch shows that this converse holds if and only if @ is obtained from 
an ordinary group by making a substitution in the headline of the group table, 
r 


where the exponents r are relatively prime to the orders of the corresponding elements X. 




They are the only ones included under both Law I of the present paper and
the general law (ab)c = ad, where d is independent of a, discussed by
Suschkewitsch. We shall refer to them for convenience as quasi-groups of
type B.


Suschkewitsch showed that any quasi-group of type B can be obtained
from an ordinary group G(X) by making a permutation s in the horizontal
title line of the group table, where s is an automorphism of the group. For
example, the set of all 'vectors' of the form
(17) $a = (a_1, a_2, \dots, a_n)$,


where $a_1, \dots, a_n$ are arbitrary elements of a given group G(X), will form a
quasi-group of type B, if we define the product of a by a similar symbol b to be


$ab = (a_1 b_n, a_2 b_1, \dots, a_n b_{n-1})$.


This merely amounts to making a Suschkewitsch substitution in the direct 
product of n factors G(X). 

A large class of Abelian quasi-groups can be obtained by a simple extension
of this method. Let G(X) be any Abelian group, in which multiplication
is denoted by X, and let s and t be any two automorphisms of G(X) such that
st = ts. The elements of G(X) then form an Abelian quasi-group G, if
multiplication in G is defined by
(18) $ab = a^{s} \times b^{t}$.


For then


$(ab)(cd) = a^{s^2} \times b^{ts} \times c^{st} \times d^{t^2} = (ac)(bd)$.
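
A small computational sketch of construction (18) may help. It assumes, purely for illustration, that the underlying Abelian group is the additive group of integers modulo m and that the automorphisms are multiplication by units s and t; the function names are hypothetical. The code builds the multiplication table, confirms that it is a Latin square (so left and right division are unique), and tests the law (ab)(cd) = (ac)(bd) exhaustively.

```python
from itertools import product
from math import gcd


def make_table(m, s, t):
    """Table of a*b = s*a + t*b (mod m) on Z_m.

    x -> s*x and x -> t*x are commuting automorphisms of the additive
    group Z_m whenever s and t are units modulo m.
    """
    if gcd(s, m) != 1 or gcd(t, m) != 1:
        raise ValueError("s and t must be units modulo m")
    return [[(s * a + t * b) % m for b in range(m)] for a in range(m)]


def is_quasigroup(table):
    """Latin-square test: every row and every column is a permutation."""
    n = len(table)
    full = set(range(n))
    rows_ok = all(set(row) == full for row in table)
    cols_ok = all({table[a][b] for a in range(n)} == full for b in range(n))
    return rows_ok and cols_ok


def satisfies_abelian_law(table):
    """Check (ab)(cd) = (ac)(bd) for all a, b, c, d."""
    n = len(table)
    return all(
        table[table[a][b]][table[c][d]] == table[table[a][c]][table[b][d]]
        for a, b, c, d in product(range(n), repeat=4)
    )


def is_commutative(table):
    """Check ab = ba for all a, b."""
    n = len(table)
    return all(table[a][b] == table[b][a] for a in range(n) for b in range(n))
```

For instance, `make_table(9, 2, 5)` gives a non-commutative quasi-group of order 9 satisfying the Abelian law, while `make_table(9, 2, 2)` is commutative, in line with the criterion s = t stated in the text for quasi-groups formed in this way.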


Moreover, s and t are also automorphisms of G. If s is the identity, this
is equivalent to making a Suschkewitsch substitution /* in the title line of 
G(X), and the resulting quasi-group is Abelian of type B. It easily follows 
that a quasi-group of type B is Abelian if, and only if, it is obtained by a 
Suschkewitsch substitution from an Abelian group. The symbols (17) will
form an Abelian quasi-group, not of type B, if $a_1, a_2, \dots, a_n$ are arbitrary
elements of an Abelian group G(X) and if multiplication is defined by


(19) $ab = (a_2 b_n, a_3 b_1, \dots, a_n b_{n-2}, a_1 b_{n-1})$.


This again, amounts to performing a substitution of type (18) on the direct 
product of n factors G(X). Equation (18) is, of course, equivalent to per- 
forming the substitutions s* and ¢' on the vertical and horizontal title lines 




respectively, in the group table for G(X). An Abelian quasi-group formed in 
this way will be commutative if and only if s = t. It follows that if an
Abelian quasi-group of type B is commutative, it is an Abelian group. 

The above method, however, does not give all Abelian quasi-groups. For 
every quasi-group obtained in this way will contain at least one principal 
unit, the unit element of G(X). That Abelian quasi-groups exist containing
no principal unit is shown by example (3) below. The quasi-groups of order 3, 


12 3 12 3 

1/1 3 2 1/1 3 2 S18 i 3 
(2) 2/3 21 (3) 211 38 2 
2 1 3/2 1 8 2 


illustrate three different types. They are all Abelian; (1) has a unique right 
unit; in (2) every element is a principal unit; (3) contains no principal unit 
and has no sub-quasi-groups. The direct product of (3) with another quasi- 
group having no sub-quasi-groups would give a quasi-group of composite order, 
but with no sub-quasi-groups. Sylow’s theorem, therefore, does not hold in 
general. Also (2) illustrates the fact that an Abelian quasi-group is not 
necessarily the direct product of cyclic quasi-groups. A non-Abelian cyclic 
quasi-group of type B, is given at the end of Suschkewitsch’s paper. It is 
obtained from the symmetric group of order 6. 

Of interest, too, is the following Abelian quasi-group of order 9 which 
cannot be obtained directly by the method described above: 


26679 
3.56497 38 
21231645 7 8 9 
41/6457 89231 
M4644 
546449 7842-2 
2.4 5 6 
8 12°83 6 4 
@ 32% 


It is the right unit quasi-group of the quasi-group of order 27 consisting of 
all elements of the form (17) where n = 3, a, do, a3 are arbitrary elements 
of the cyclic group of order 3, and multiplication is defined by (19). 


YALE UNIVERSITY. 




ON DIFFERENTIAL OPERATORS IN HILBERT SPACES.* 


By Kurt FRIEDRICHS. 


Symmetric differential operators from the point of view of Hilbert space 
present principally two problems. 

The first problem is to define the domain of the operator (i. e., the mani- 
fold of functions on which it is applicable) in such a way that the operator is 
self-adjoint (or hypermaximal). According to the theory of v. Neumann and 
Stone the problem of obtaining the spectral resolution of a non-bounded 
operator is completely reduced to that of establishing its self-adjointness.¹

The second type of problems refers to the properties of this domain,²
in particular the question arises under which circumstances functions of this 
domain are continuous and have derivatives. 

The first problem may be exemplified in the case of the operator

$-\Delta = -\frac{\partial^2}{\partial x_1^2} - \cdots - \frac{\partial^2}{\partial x_n^2}$

acting on functions $u(x_1, \dots, x_n)$ which are defined in a region S and vanish
at the boundary of S. A theorem of v. Neumann³ states that an operator is
self-adjoint if it is the product of two adjoint operators. This suggests
considering the above operator as the product D*D of the gradient D, which
transforms the function u into the system

$\left\{\frac{\partial u}{\partial x_1}, \dots, \frac{\partial u}{\partial x_n}\right\}$,

and the negative divergence D*, which transforms a system $v' = \{v_1, \dots, v_n\}$
into the function

$-\frac{\partial v_1}{\partial x_1} - \cdots - \frac{\partial v_n}{\partial x_n}$.


* Received May 25, 1938. 

¹ It may be noted that, on the basis of this paper, the theory of the corresponding
boundary value and eigenvalue problems is, therefore, obtained by straightforward
application of the existing theory of the Hilbert space. For a different approach cf.
Courant-Hilbert, Methoden der mathematischen Physik, vol. II, chap. VII.

² Cf. I. Halperin, "Closures and adjoints of linear differential operators," Annals
of Mathematics, vol. 38 (1937), p. 880. A complete characterization of closure and
adjoints is given for ordinary and hyperbolic differential operators of any order and
certain elliptic cases are treated. Cf. also F. J. Murray, Transactions of the American
Mathematical Society, vol. 37 (1935), p. 301.

³ J. v. Neumann, Annals of Mathematics, vol. 33 (1932), p. 294.



Thus the problem is reduced to that of defining the domains (Go and 6) 
of D and D* so that these operators are adjoint. The domain Go of D is 
restricted by the conditions 


and by the boundary condition; the domain G’ of D* is restricted by 


$\int_S \{v_1^2 + \cdots + v_n^2\}\,dx_1 \cdots dx_n < \infty$

and

$\int_S (D^*v')^2\,dx_1 \cdots dx_n < \infty$.


In order that the operators D and D* be closed, the class of functions u
and $v_1, \dots, v_n$ with continuous derivatives must be extended to manifolds of
functions which are differentiable in a more general sense. For the operator
D, this extension is possible by permitting differentiability in the sense of
Lebesgue's theory. A corresponding definition of D* does not seem to be
obvious. Therefore, it seems preferable to effect the extension of both operators
in the following different way, giving no preference to D or D*.

First, we consider the operator D, applied on functions which have continuous
derivatives and vanish in the neighborhood of the boundary of S.
We then define D in Go as the closure,* and D* in W’ as the adjoint of this 
operator. We say: D in Ga is defined in the strong, and D* in ©’ in the 
weak sense. 

Secondly, we consider the operator D*, applied on systems of functions 
with continuous derivatives and define D* in W’ as the closure (strong sense), 
and ® in Gow as the adjoint (weak sense) of this operator. 

Our main theorem is that the strong and the weak extensions coincide.’ 

To prove this identity we use a sequence of simple integral operators,
which produce functions with derivatives of every order, which are commutative
with D and D* even in the weak sense, and which approximate the
unit in the sense of the quadratic metric.


⁴ We may say: we replace the definition of the derivative as the limit of
$h^{-1}[u(x+h) - u(x)]$ by a different limit process.

® Having proved the identity of both operators D* in @&’, we obtain the identity 
of both D in @%,, from the theorem of v. Neumann that a closed operator is the adjoint 
of its adjoint. 




The theory of the operators D and D* is presented in §§ 1-6 for any 
number n of the dimension and for any open set S.⁶

The §§ 7-14 are concerned with the application to the general symmetric 
elliptic differential operator of the second order, where the coefficients are 
only required to be continuous in the region S. We obtain the self-adjointness
of this operator under two boundary conditions, which correspond
to the vanishing of the function u or of its normal derivatives at the boundary
of S.⁷ In § 14 we apply the preceding theory to Schrödinger's operator.

In § 15 we deal with the second type of problems confining ourselves to
the operators $D^*D = -\Delta$ and $-\Delta + q$ without boundary conditions. On
the basis of the theory of §§ 1-6 we obtain the following result, which supplements
well-known facts of potential theory:

Let n be the dimension of the x-space and introduce the number


$m = \left[\frac{n}{2}\right] + 1$.


Let u(x) be a function on which the operations $\nabla_1 = D$, $\nabla_2 = D^*D$,
$\nabla_3 = DD^*D, \dots$ up to $\nabla_r$ can be applied such that $u, \nabla_1 u, \dots, \nabla_r u$
are $L^2$-integrable. Then u(x) is continuous and has continuous derivatives
up to the order $r - m$ provided that $r \ge m$.⁸ ¹⁰
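
As a quick arithmetic illustration of this statement, the following sketch computes m = [n/2] + 1 and the order r - m of continuous derivatives that the result guarantees. The function names are hypothetical and only restate the formula above; nothing beyond that formula is assumed.

```python
def friedrichs_m(n):
    """Return m = [n/2] + 1 for space dimension n (integer part of n/2, plus one)."""
    if n < 1:
        raise ValueError("the dimension n must be a positive integer")
    return n // 2 + 1


def continuous_derivative_order(n, r):
    """Order r - m of guaranteed continuous derivatives, or None when r < m.

    r is the number of operations nabla_1, ..., nabla_r that can be applied
    with square-integrable results.
    """
    m = friedrichs_m(n)
    return r - m if r >= m else None
```

For example, in three dimensions m = 2, so two applicable operations give continuity of u itself, while a single one gives no conclusion from this statement.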

An immediate consequence is this: Let u(x) be a characteristic function
of the operator $D^*D = -\Delta$ under a certain boundary condition (by characteristic
functions we mean not only eigenfunctions of point-eigenvalues
but generally functions in the manifold of a projection belonging to any finite
interval of the λ-axis). Then u(x) has derivatives of every order. For, the
above process can be carried on indefinitely, since u admits the application of
D*D and D*Du is again a characteristic function (in case u belongs to the
point-eigenvalue λ this follows from $D^*Du = \lambda u$). In § 15 we shall establish
the corresponding fact for the operator D*D + q(x), thus including
Schrödinger's operator.

The present paper generalizes and supersedes the results of two previous
papers¹¹ except those concerning point-spectra.


⁶ No particular behaviour at the boundary will be required of the functions under
consideration. For this, the concept of adjointness will be slightly modified.

It may be noted that a similar treatment is possible for any $L^p$-metric, where
$p \ge 1$, instead of Hilbert's $L^2$-metric.

⁷ For a treatment of the manifold of all self-adjoint boundary conditions for the
operator $-\Delta + q$ and rectifiable boundary, see J. W. Calkin, Proceedings
of the National Academy of Sciences, vol. 24 (January, 1938), 1, pp. 38-42 and
a forthcoming note.

⁸ An equivalent statement (except for n = r = 1) is this. Let u be a solution of
D*Du = f, where u and Du are $L^2$-integrable; then u possesses continuous derivatives
up to the order $r - m$ if f is $L^2$-integrable and admits the application of $\nabla_1, \nabla_2, \dots$
in the above sense up to the order $r - 2$.

⁹ It may be mentioned that a like proposition holds under any $L^p$-metric, $p \ge 1$,
with $m = [n/p] + 1$, except for the case that p = 1 and n is odd, where m = n.

The condition $r \ge m$ is necessary for continuity since the discontinuous function
$u = \log(c\,|\log|x||)$ admits the above process $m - 1$ times.

Cf. S. Sobolev, Sur quelques évaluations concernant les familles des fonctions
ayant des dérivées à carré intégrable, C. R. Acad. Sci. URSS, N. s. 1, 279-282 (1936).
There a corresponding fact is stated where the applicability of all differentiations of
the order r, instead of $\nabla_r$, is required.

1. Definitions. Let $x = \{x_1, \dots, x_n\}$ be the points of an n-dimensional
space; by S we denote open regions of this space. If S' is a bounded region
within S and if the boundary of S' is in S, too, we write S' < S.


Integrals over a domain S shall be written $\int_S \cdots\,dx$ instead of


$\int \cdots \int_S \cdots\,dx_1 \cdots dx_n$.


We denote by @yg the manifold of all continuous functions u(z) defined 
in S to which a 8S” <S exists such that w(x) —0 outside of 8”; these 
functions shall be defined as identically zero outside of S. 

We denote by Sg the manifold of all functions u(z) which are L?-integrable 


in every region S’ < 8; we set 


| -f, | w 


We denote by &g the manifold of all u(x) in &g which vanish outside of a 
region S” < S; we set 
| u|s—= | u 
Two functions u(x), v(x) in Sy are called “equal” if | w—v|s =? 
for every 8’ < 8. 
For u(x), w(x) in &g the integral f. w(x)u(x)dz is defined and we 


have 


| | =| w || 


The same integral is defined for in in Vy. If | = | 


then 


w(x)u(a)dz = w(2)u(a) de. 


11 Mathematische Annalen Bd. 109 (1934), pp. 465-487, pp. 685-713. Bd. 110 (1935), 
p. 777. 


If u(x) is a function in &g such that 
w(x)u(x)dz = 0 
JS 
for all w(x) in Gg then = 0. 


2. The operator $J_a$. Let S' be a region < S. We denote by $S'_a$ the
region of all points y in S such that the region

$|x_1 - y_1| < a, \dots, |x_n - y_n| < a$

contains at least a point x of S'. We assume a so small that
$S'_a < S$.
We choose a function e(t) which has derivatives of all orders, for which
$e(t) \ge 0$ or $= 0$ for $|t| \le 1$ or $\ge 1$ respectively, and


$\int_{-\infty}^{\infty} e(t)\,dt = 1$.


We define the kernel

$j_a(z) = a^{-n}\,e(z_1/a) \cdots e(z_n/a)$

and the integral operator $J_a$ which takes the function u(x) into the function


$J_a u(x) = \int j_a(y - x)\,u(y)\,dy$.
If u(x) is in &g then Jau(z) is a function in Cs. 
The operator Ja has the following properties: 


(2. 1) | Jou | 
for u(x) in &s. 


(2. 2)22 | Jaw — wu |g 0, as a0 
for u(x) in &y. 

(2.3) | as a0 
for u(x) in if <8’ where | =| w |g". 
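
The behaviour described by (2.1) and (2.2) can be observed numerically in one dimension. The sketch below is an illustration under assumptions that are not in the paper: the particular bump exp(-1/(1 - t^2)), the midpoint quadrature, and all function names are hypothetical choices. It smooths a discontinuous step function and measures the quadratic distance between the smoothed and the original function for several values of a.

```python
import math


def bump(t):
    """Unnormalised smooth bump: positive for |t| < 1 and zero otherwise."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0


def _midpoint(f, lo, hi, steps):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))


BUMP_MASS = _midpoint(bump, -1.0, 1.0, 2000)


def e(t):
    """Kernel profile with (numerically) unit integral, playing the role of e(t)."""
    return bump(t) / BUMP_MASS


def mollify(u, a, steps=200):
    """Return the function x -> integral of (1/a) e((y - x)/a) u(y) dy, for n = 1."""
    def smoothed(x):
        return _midpoint(lambda y: e((y - x) / a) / a * u(y), x - a, x + a, steps)
    return smoothed


def l2_distance(f, g, lo, hi, steps=400):
    """Approximate quadratic-metric distance between f and g on [lo, hi]."""
    return math.sqrt(_midpoint(lambda x: (f(x) - g(x)) ** 2, lo, hi, steps))


def step(x):
    """A discontinuous square-integrable test function."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0
```

Evaluating `l2_distance(mollify(step, a), step, -1.0, 2.0)` for a = 0.4, 0.2, 0.1 should give a decreasing sequence, which is the numerical counterpart of (2.2), while the quadratic norm of the smoothed step stays below that of the step itself, as in (2.1).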


For u(x) in &g we have the relation 
(2.4) $|J_a u(x)|^2 \le \int j_a(y - x)\,|u(y)|^2\,dy$;


¹² For the case n = 1 this is contained in the results of K. Ogura, Tôhoku Mathematical
Journal, vol. 16 (1919), pp. 118-125 (Theorem II).




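for we have, by Schwarz's inequality and since $\int j_a(y - x)\,dy = 1$,


$|J_a u(x)|^2 = \left|\int j_a(y - x)\,u(y)\,dy\right|^2 \le \int j_a(y - x)\,dy \int j_a(y - x)\,|u(y)|^2\,dy$.


From (2.4) we deduce


$\int_S |J_a u(x)|^2\,dx \le \int_S \int j_a(y - x)\,|u(y)|^2\,dy\,dx$


and thus the inequality (2.1).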


To prove the relation (2.2) we observe that u(x) can be approximated 
by functions u(x) in Gs such that | w—w, | g—>0 as b->0; in view of 
inequality (2.1) we get |Ja(u—w)|s—0, as b—>0. Therefore it is 
sufficient to prove | Jauw»— For z in 8’, we get 


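$|J_a u_b(x) - u_b(x)| = \left|\int j_a(y - x)\,[u_b(y) - u_b(x)]\,dy\right|$


$\le$ l.u.b. $|u_b(y) - u_b(x)|$ for $|x_1 - y_1| < a, \dots, |x_n - y_n| < a$.


Since $u_b(x)$ is continuous the right hand term tends to zero as a → 0; therefore


$|J_a u_b - u_b|^2_{S'} = \int_{S'} |J_a u_b(x) - u_b(x)|^2\,dx \to 0$.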


If u(x) is in Xs, | =| war, < 8’, we have Jau(x) = 0 outside 
of 8’; therefore | Jawu—u|s—=|Jaw—u|g. Thus we obtain (2.3). 


3. Systems. By n + 1 functions $v_0(x), \dots, v_n(x)$ we form the
system
$v'(x) = \{v_0(x), \dots, v_n(x)\}$.


The manifold of all such systems of functions in €, &, @ shall be denoted 
by & respectively. For v’(x) in &’g we set 


| =| vo | on 


For w’(x), v’(x) in and for w’(z) in v’(a) in Ly we set 




Finally we set 


4, Differential operators in the interior. Let Ds and D’s be the spaces 
of all functions u(#) in Gg and v’(z) in G@’g respectively which have con- 


tinuous derivatives 
By D we denote the operator which takes u(z) in Dg into the function in @’s 
$Du(x) = \{\delta u(x), D_1 u(x), \dots, D_n u(x)\}$,


where δ is an arbitrary fixed number (which may be zero). By D* we denote
the operator which takes the function v’(z) in D’s into the function in Gg 


$D^*v'(x) = \delta v_0(x) - D_1 v_1(x) - \cdots - D_n v_n(x)$.


For u(x) in Ds and v'(z) in D’g we have 
8 


5. Operator Din Rs. We now extend the operator D from Dy toa 
space Ry C Ss. We define Ry as the manifold of all functions in &g for which 
a function u’(z) in &y exists such that 


D*w’ (x) - u(x) dx 
for all w’(x) in Os. If u(x) =0 then u’(z) =0, also. For, since u(r) =0 


in S we have { w'u'dz =0 for all w’ in 9g; from this one obtains ™ 
JS 
i, | u’ |?da —0, hence u’=0 in each S’< 8. In Ss, therefore, we can 
define a linear operator D by 
Du(«) = {8u(z), D, +, = w(x), 


Ds. 


which is evidently an extension of the operator D in 
We shall speak of a differential operator defined in this way as being an 


18. g. Set w’ = Jw’ and use (2.2) and (2.1). 


extension in the “weak” sense. Immediate consequences of this definition 
are the properties: 


5.1. If u(x) in Rs, w’(z) in $s, then 


(5.1) f° D*u’ (x) u(x) dz. 


5.2. If u(x) in in is such that 


(5. 2) f, D*w’ (x) -u(x)dz 


for all w’(z) in D’g then u(x) is in Ry and Du(z) = u'(z). 
As a matter of fact the operator D in Rs can be shown to be an extension 


in a “strong” sense also, for we have 


5.3. If u(x) in Xs, u(x) in ’s has the property that for every 8S’ <8 
@ sequence Ua(z) in Dg exists such that 


(5.3) | |g + | Dug |g > 0 
then is in Rs and Du(z) = w'(z). 


5.4. For every function u(x) in Ss and region S’ < S there exists a sequence 
Ua(z) in Dg such that 


(5. 4) | ta —u |g + | Dua — Du as a—>0."* 16 


Property 5.3 is an immediate consequence of 5.2: for every w’(x) in 
$s there is a S’ < 8 such that w’(z) —0 outside of §’; now we choose a 
sequence Ua(x) in Dz such that (5.3) holds, from (4.0), for w=, v’ =w’, 
we obtain (5.2) as a → 0; thus, according to 5.2, u(x) is in Rs and
Du(az) (2). 

In order to prove 5.4 we take the operator Ja. We observe that Jau(z) 
is in Dg if u(x) is in Qs. We then state the basic 


14 They do not exactly express that D in Qy is adjoint to D* in Dg since the 
functions u(#) in @g are not restricted to be in the Hilbert space defined by the unit 


form | dz. 


8 These properties do not exactly express that D in Q,y is the closure of D in Dg 


for the same reason as in footnote *. 
165.3 states that the strong extension is contained in the weak extension, while 


, 5.4 states the converse. 




LemMA 5.1. If u(x) ts in &g then 


$DJ_a u(x) = J_a Du(x)$
for x in S'.


To prove it we observe that, for »p ~ 0, 


a 
Duda) f ja(y — )u(y) dy 


9 ja(y—x)u(y) dy. 
S’a 


Since ja(y— 2), when considered as a function of y, is in Ds if x is in 9’ 
we can define a function w’(y) in by = ja(y—z), 
if v—=yp. Then we get 


D*w'(y) -u(y)dy. 


In view of 5.1 this is 


w(y)Du(y)dy = f —2) Duly) dy = 
S’a 


and thus $DJ_a u(x) = J_a Du(x)$.
From this lemma we deduce 5.4 immediately. For if u(x) in Rs, we 
obtain from (2. 2) 
| Jaw — wu |g > 0 
and 
| — Du |g = | JaDu— Du > 0 as a0. 


Finally we define the space Rg of all u(x) in Rg which are in Ly and for 
which Du(x) is in &’s. Then we have 5.5 for every function u(x) in &s 
a sequence Uq in Dg exists such that 


(5. 5) | + | Dua — Du |s 0, as a0. 


We obtain this property from Lemma 5.1, on using (2.3) and choosing S” 


such that 
| U ls | U | Du ls = | Du S’ = S”. 


6. Operator D* in ®s. In a corresponding way we extend the operator 
D* from D's to a space Rs. We define ®s as the manifold of all v(x) in 2's 
for which a v(x) in &g exists such that 


17 Cf. for nm = 1 K. Ogura (l.c. 12), Theorem VII. 




f Dw (ax) dz 
8 8 
for all w(x) in Dy. As in § 5 we conclude that v’(z) =0 implies v(x) = 0, 
Thus we can define a linear operator D* in 8s by 
D*v’ (x) = v(2), 


which is obviously an extension of D* in D’s. We call it an extension in the 
“weak” sense. Immediate consequences of this definition are the properties 


6.1. If in w(x) in Dg, then 


(6.1) (2) (2) de Dw(2) (a) de. 

6.2. If in v(x) in &g is such that 

(6. 2) w(a)v(x)de— Dw(x)v' (x) dx 

for all w(x) in Dz then is in Rg and D*v’ (xz) = v(2). 
Further the operator D* in §’g has the following properties ** ”° 


6.3. If the functions v’(xz) in &s, v(x) in &’g have the property that for 
every region S’ < 8 a sequence vq(x) in D’g exists such that 


(6. 3) | va |g + | v > 0, as 0, 
then v’(x) is in R’s and D*v’(x) = v(z). 


6.4. For every v’(z) in &’s and region 8S’ < a sequence v’a(z) in Y's 
exists such that 


(6. 4) 


va—v + | — D*v’ 30, as a> 0. 

Property 6.3 is an immediate consequence of 6.2. Property 6.4 is
obtained, on setting v’a(x) = Jav’ (x), from 

LemMA 6.1. If v’(x) is in R’g then 


D*J qv’ (2) = JaD*v' (a). 


This lemma is proved in the same way as Lemma 5.1 on applying (6.1) to 
w(y) 

Finally we define §’s to be the space of all v’(x) in gs which are in &s 
and for which D*v’(x) is in 2g. Then we have 


¹⁸, ¹⁹, ²⁰ Remarks corresponding to ¹⁴, ¹⁵, ¹⁶ may be applied respectively.


6.5. To every function in v’(z) in Rs a sequence v’a(#) in Dy exists such 
that 
(6.5) | —v’ |g + | — D*v’ |p 0, as a> 0. 
7. Closed spaces § and §’. In what follows we omit the subscript S.


Let r(x) be a positive continuous function defined in 8. Let R and R* 
be the operators which take a function u(az) in & respectively into the functions 


in 
$Ru(x) = r(x)u(x), \quad R^{-1}u(x) = r^{-1}(x)u(x)$.


For u(x), w(x) in &, 8’ < 8, the bilinear form 


J, w(x)Ru(x)dz 


is defined and we have 


u(x)Ru(x)dx > 0, for u(x) #0 in 8’. 
s’ 
We define 


(uRu) u(x) Ru(x)dz = 1. u. b. Ru(a) dr 


and introduce the manifold § of all u(z) in & for which (uRu) < «. For 
u(x), w(x) in the bilinear form 


(wRu) w(x)Ru(a)dx 


is defined. § is a Hilbert space and contains C densely. 

Let puv(z); (u,v be continuous functions defined in 
such that at every z in S the matrix pyy is symmetric and positive definite. 
Let P be the operator which takes every function v’(4) = {vo(x),°**, Un(x)} 
in 2’ into the function in & 


Py’ (x) = { > pov (2) 


p=-0 


For v’(x), w’(z) in &, 8’ < S, the bilinear form 


f w’ (x) Pv’ (x) dx 
e Ss’ 


is defined and we have 


(2) Pv’ (x)dz > 0, as vo’ (x) in 8’. 


We define 
(v'Pv’) = v (2) Pv’ (2)dz =1. u.b. v (x) Pv’ (x) dz 
8 <8 Ss’ 


and introduce the manifold §’ of all v’(z) in 2 for which (v’Pv’) < a, 
For v(x), w’(z) in the bilinear form 


(w’Pu’) = w’ (2) Pv’ (x) dx 


is defined. §’ is a Hilbert space and contains C’ densely. 
We denote by P*8’, P“*8’, P-'D’ the space of all functions v’(x) for 
which Pv’(xz) is in 8’, 8’, D’ respectively. In all these spaces the operator 


$E = R^{-1}D^*P$
is defined ; it produces functions in &, 2, € respectively. 


8. The spaces &, and Gx. We denote by ®o the space of all u(x) in 
§ which are in 8 and for which Du(z) is in $’. In this space the unit form 


$\|u\|^2 = (uRu) + (DuPDu)$
is defined. We have 


THEOREM 8.1. The space ®,o is complete with respect to || u ||. 


This can be deduced from 5.3 by well-known reasoning. 

Obviously ©, contains R (cf. §5). We denote by Gx the closure of R 
with respect to || w |j.7? 

We have 


THEOREM 8.2. The space Ga contains D densely.” 


It suffices to prove that to every function u(a) in @ a sequence wa(z) 
in D exists such that || w—wu||-0 asa—>0. 
We choose a S° << § such that wu is in Ry? and apply 5.5 to S° instead 


of S; since the coefficients of R and P are bounded in S° and wa in Ds? is 0 


Dy, too, we obtain the statement. 


*1 The condition for a function of (§, to be in (@,. is a boundary condition; for 
if u, is in @q and U, in On differs from u, only in the interior then U, is in Gq, 100, 
because u,— wu, is in Q (cf. § 5, p. 8) and therefore in Gq. 

22 It would be possible to define (,, as the closure of . Then, instead of Th. 
8. 2, we would prove that @ is in (§,, and thus the condition defining @,. is a boundary 
condition. 




9. The spaces ©’, and W’ax. We denote by the space of all v’(z) 
in § which are in P*§8Y and for which Ev’ = R*D*Pv’ is in §. In this 
space the unit form 

$\|v'\|^2 = (v'Pv') + (Ev'REv')$


is defined. As a consequence of 6.3 we have 


THEOREM 9.1. The space &’y is complete with respect to || v’ ||. 


Obviously G’, contains the space P+” of all functions v(x) for which 
Pv'(z) is in R”. We introduce the closure W’oo of P-'*&’ with respect to 
|” |.2° We state 


THEOREM 9.2. The space contains densely.** 
This follows immediately from 6.5, as 8.2 had followed from 5.5.


10. The operators D in &, and E in G’x. The operator D defined in 
produces functions in §’ while = R-*D*P is defined in W’o and pro- 
duces functions in §. Between these operators the following relations hold. 


10.1. Ifv’(z) is in Wx, u(x) in Go, then 

(10. 1) (uREv’) = (DuPv’). 

10.2. If u(x) in §, w(x) in §’ are such that 

(10. 2) (uREw’) = (w’Pw’) 

for all w’(x) in then u(«) is in and Du(z) = u'(2). 
10.3 If v’(x) in §’, v(x) in § are such that 

(10. 3) (wRv) = (DwPv’) 


for all w(x) in then v’(z) is in Goo and = Ev’ 

10.1 and 10.2 imply that D in Go is adjoint to # in ©’o0o. 10.1 and 
10.3 state that in is adjoint to D in 

10.1 follows from 5.1 and Th. 9.2; 10.2 follows from 5.2. 10.3 is a 
consequence of a basic theorem of von Neumann * which states that a closed 
operator is the adjoint of its adjoint. It seems desirable to repeat his


reasoning for our case. 


28 Of, 21, 24 Of, 2, 25 Of, 2, 




Let [§, $’] be the space of all pairs [u(x), v’(x)] of u(x) in §, v’(z) 
in §’; it is a Hilbert space with respect to the unit form 


(uRu) + (v’Pv’). 


Let be the subspace of all pairs [H#v’(x), v’(x)] where v’(z) 
is in ©’. This space is closed according to the definition of Go. Let 
— _D@G_] be the subspace of all pairs [u(x),— Du(x)] where u(z) is 
in ©. According to 10.1 and 10.2 this space consists of all elements of 
$’] which are orthogonal to Wa]. Since [§, §’] is a Hilbert 
space the closed space contains all elements of [§, which 
are orthogonal to [®.,— D®,]. That is exactly the statement of 10. 3. 


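11. The operator $D$ in $\mathfrak{G}_\infty$ and $E$ in $\mathfrak{G}'_0$. Between the operators $D$ in $\mathfrak{G}_\infty$ and $E$ in $\mathfrak{G}'_0$ the following relations hold.

11.1. If $u(x)$ is in $\mathfrak{G}_\infty$, $v'(x)$ in $\mathfrak{G}'_0$, then

(11.1) $(v'PDu) = (Ev'Ru)$.

11.2. If $v'(x)$ in $\mathfrak{H}'$ and $v(x)$ in $\mathfrak{H}$ are such that

(11.2) $(v'PDw) = (vRw)$

for all $w(x)$ in $\mathfrak{G}_\infty$, then $v'(x)$ is in $\mathfrak{G}'_0$ and $Ev'(x) = v(x)$.

11.3. If $u(x)$ in $\mathfrak{H}$ and $u'(x)$ in $\mathfrak{H}'$ are such that

(11.3) $(w'Pu') = (Ew'Ru)$

for all $w'(x)$ in $\mathfrak{G}'_0$, then $u(x)$ is in $\mathfrak{G}_\infty$ and $u'(x) = Du(x)$.

11.1 and 11.2 include the fact that $E$ in $\mathfrak{G}'_0$ is adjoint to $D$ in $\mathfrak{G}_\infty$. They follow from 6.1 and Th. 8.2. 11.1 and 11.3 state that $D$ in $\mathfrak{G}_\infty$ is adjoint to $E$ in $\mathfrak{G}'_0$; this is again a consequence of the theorem of von Neumann.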


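12. Self-adjoint operator $ED$ in $\mathfrak{F}$. Let $\mathfrak{F}_\infty$ be the space of all functions $u(x)$ in $\mathfrak{G}_\infty$ for which $Du(x)$ is in $\mathfrak{G}'_0$. Let $\mathfrak{F}_0$ be the space of all functions $u(x)$ in $\mathfrak{G}_0$ for which $Du(x)$ is in $\mathfrak{G}'_\infty$. The operator $ED$ is defined in $\mathfrak{F}_\infty$ and in $\mathfrak{F}_0$ and produces functions in $\mathfrak{H}$.

It is convenient to denote by $\mathfrak{G}$, $\mathfrak{G}'$, $\mathfrak{F}$ either $\mathfrak{G}_\infty$, $\mathfrak{G}'_0$, $\mathfrak{F}_\infty$ or $\mathfrak{G}_0$, $\mathfrak{G}'_\infty$, $\mathfrak{F}_0$ respectively.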

It is by no means evident that the spaces $\mathfrak{F}$ are dense in $\mathfrak{H}$, and that they contain functions other than $u(x) = 0$. But according to a theorem of von Neumann (l. c.$^{25}$), $\mathfrak{F}$ is dense and $ED$ in $\mathfrak{F}$ is self-adjoint (hypermaximal). Explicitly we have:

$\mathfrak{F}$ is dense in $\mathfrak{H}$ with respect to $(uRu)$.


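12.1. If $u(x)$ and $w(x)$ are in $\mathfrak{F}$, then

(12.1) $(wREDu) = (EDwRu)$.

12.2. If $u(x)$, $v(x)$ are functions in $\mathfrak{H}$ such that

(12.2) $(wRv) = (EDwRu)$

for all $w(x)$ in $\mathfrak{F}$, then $u(x)$ is in $\mathfrak{F}$ and $EDu(x) = v(x)$.

Further we have:

$\mathfrak{F}$ is dense in $\mathfrak{G}$ with respect to $(DuPDu)$.

12.3. If $u(x)$ is in $\mathfrak{F}$, $w(x)$ in $\mathfrak{G}$, then

(12.3) $(wREDu) = (DwPDu)$.

12.4. If $u(x)$ in $\mathfrak{G}$, $v(x)$ in $\mathfrak{H}$ are such that

(12.4) $(wRv) = (DwPDu)$

for all $w(x)$ in $\mathfrak{G}$, then $u(x)$ is in $\mathfrak{F}$ and $EDu(x) = v(x)$.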

We prove the preceding statements in a way$^{26}$ which differs slightly from von Neumann's reasoning. From 10.1 or 11.1 we obtain 12.3; from this and 10.1 or 11.1 we obtain 12.1. We deduce from 12.2 that $\mathfrak{F}$ is dense in $\mathfrak{H}$; for if $(wRv) = 0$ for all $w$ in $\mathfrak{F}$, then (12.2) is satisfied by $u = 0$; hence $v = EDu = 0$. Further, we deduce from 12.2 that $\mathfrak{F}$ is dense in $\mathfrak{G}$; for if $(DwPDu) = 0$ for all $w$ in $\mathfrak{F}$ and some $u$ in $\mathfrak{G}$, we find from 12.3 that (12.2) is satisfied by $v = 0$; hence $u$ is in $\mathfrak{F}$; therefore $(DuPDu) = 0$ and $u = 0$.


We thus need prove only 12.2. We first state

LEMMA 12.1. To every function $h(x)$ in $\mathfrak{H}$ there is a function $f(x)$ in $\mathfrak{F}$ such that $(ED + 1)f = h$.

To prove this lemma we observe that $(wRh)$ is a linear form for functions $w(x)$ in $\mathfrak{G}$ which is bounded with respect to the unit form $(DwPDw) + (wRw)$. Therefore a function $f$ in $\mathfrak{G}$ exists such that

$(DwPDf) + (wRf) = (wRh)$

for all $w$ in $\mathfrak{G}$. From 10.3 or 11.2 we obtain that $f$ is in $\mathfrak{F}$ and $(ED + 1)f = h$.

$^{26}$ Cf. the corresponding reasoning in K. Friedrichs, Mathematische Annalen, Bd. 109, S. 465, 1934; H. Freudenthal, Proc. Kon. Acad. Wet. Amsterdam, 39, N. 7, 1936.
Before applying this lemma we remark that 12.2 is evidently equivalent to

12.2'. If $u(x)$, $v_1(x)$ are functions in $\mathfrak{H}$ such that

(12.2') $(wRv_1) = ((ED + 1)wRu)$

for all $w$ in $\mathfrak{F}$, then $u(x)$ is in $\mathfrak{F}$ and $(ED + 1)u(x) = v_1(x)$.

Let $u(x)$, $v_1(x)$ be such functions. In view of Lemma 12.1 there is a function $u_1(x)$ in $\mathfrak{F}$ such that $(ED + 1)u_1(x) = v_1(x)$. Hence, from (12.2'), $((ED + 1)wRu) = (wR(ED + 1)u_1) = ((ED + 1)wRu_1)$ in view of 12.1. Thus $((ED + 1)wR(u_1 - u)) = 0$. On applying the lemma again we may choose the function $w(x)$ such that $(ED + 1)w(x) = u_1(x) - u(x)$; thus we get

$((u_1 - u)R(u_1 - u)) = 0$, hence $u = u_1$ is in $\mathfrak{F}$.
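The existence statement of Lemma 12.1 can be illustrated in a finite-dimensional model. The sketch below is an illustration under stated assumptions, not the argument of the text: the matrices `D`, `R_DIAG`, `P` and the helper names are hypothetical, and the bounded-form argument is replaced by solving the linear system that the identity $(DwPDf) + (wRf) = (wRh)$ produces when $w$ runs through the unit vectors.

```python
from fractions import Fraction

# Hypothetical discrete model (illustrative values, not from the paper).
D = [[-1, 1, 0],
     [0, -1, 1]]        # difference matrix playing the role of D
R_DIAG = [2, 1, 3]      # positive weights playing the role of R
P = [[2, 1],
     [1, 3]]            # symmetric positive-definite matrix playing the role of P
N = len(R_DIAG)


def apply_D(u):
    """Apply the difference matrix D to a vector u."""
    return [sum(d * x for d, x in zip(row, u)) for row in D]


def form_P(a, b):
    """Bilinear form (a P b)."""
    return sum(a[i] * P[i][j] * b[j]
               for i in range(len(P)) for j in range(len(P)))


def form_R(a, b):
    """Bilinear form (a R b)."""
    return sum(x * r * y for x, r, y in zip(a, R_DIAG, b))


def solve(A, b):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot_row = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot_row] = M[pivot_row], M[col]
        pivot = M[col][col]
        M[col] = [x / pivot for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]


def solve_lemma(h):
    """Return f with (DwPDf) + (wRf) = (wRh) for every w."""
    basis = [[1 if j == i else 0 for j in range(N)] for i in range(N)]
    A = [[form_P(apply_D(w), apply_D(e)) + form_R(w, e) for e in basis]
         for w in basis]
    rhs = [form_R(w, h) for w in basis]
    return solve(A, rhs)


h = [1, 0, 0]
f = solve_lemma(h)
print(f)
```

The system matrix belongs to the positive-definite form $(DwPDw) + (wRw)$, so a non-zero pivot is always found; this is the finite-dimensional counterpart of the boundedness used in the proof.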


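13. Modification of the matrix $P$. Let $q_1(x), \dots, q_n(x)$ be continuous functions in $S$ which have continuous first derivatives. Then we define a "difference operator" $\Delta P$ by the matrix

$\Delta p_{\mu\nu}(x) = 0$, $\mu, \nu \neq 0$,
$\Delta p_{0\nu}(x) = q_\nu(x)$, $\nu \neq 0$,
$\Delta p_{\mu 0}(x) = q_\mu(x)$, $\mu \neq 0$,
$\Delta p_{00}(x) = (D_1q_1(x) + \cdots + D_nq_n(x))$.

On setting

$\Delta E = R^{-1}D^*\Delta P$

we have

LEMMA 13.1. If $\Delta P$ is a difference operator, $u(x)$ in $\mathfrak{R}$, then $\Delta PDu(x)$ is in $\mathfrak{R}'$ and $\Delta EDu(x) = 0$.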


We set $q' = \{0, q_1, \dots, q_n\}$; then we have


Let $w(x)$ be a function in $\mathfrak{D}$; then we obtain from 5.1

$\int_S w\,[D^*q' \cdot u - q'Du]\,dx = \int_S Dw\,q'u\,dx$.

Hence 6.2 shows that $q'u$ is in $\mathfrak{R}'$ and $D^*q'u = D^*q' \cdot u - q'Du$; therefore $\Delta PDu(x)$ is in $\mathfrak{R}'$, too, and $D^*\Delta PDu(x) = 0$.

By the matrix

$r_{00}(x) = r(x)$, $r_{\mu\nu}(x) = 0$ otherwise,

we introduce an operator which we denote also by $R$.


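Then we term the difference operator $\Delta P$ "admissible" to $P$ if a constant $c$ exists such that the operator

$P + \Delta P + cR$

is positive definite in every point of $S$. For this operator we define the spaces $\mathfrak{G}_0(\Delta P)$, $\mathfrak{G}_\infty(\Delta P)$, $\mathfrak{F}_0(\Delta P)$, $\mathfrak{F}_\infty(\Delta P)$$^{27}$ in the same way as the spaces $\mathfrak{G}_0$, $\mathfrak{G}_\infty$, $\mathfrak{F}_0$, $\mathfrak{F}_\infty$ for the operator $P$.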

If $\Delta P$ vanishes in a neighborhood of the boundary it is admissible and the spaces $\mathfrak{G}_0$, $\mathfrak{G}_\infty$, $\mathfrak{F}_0$, $\mathfrak{F}_\infty$ are not changed. Thus the modification of these spaces depends on the behavior of $\Delta P$ only at the boundary of $S$. Therefore we can say that the operators $ED$ in $\mathfrak{F}(\Delta P)$ differ only in their behavior at the boundary, and the manifold of admissible $\Delta P$ represents a manifold of boundary conditions. But this does not concern the operator $ED$ in $\mathfrak{F}_\infty$; for we have

THEOREM 13.1. $\mathfrak{F}_\infty(\Delta P) = \mathfrak{F}_\infty$.

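According to Lemma 13.1 we then have

$(E + \Delta E)Du(x) = EDu(x)$.

We first prove $\mathfrak{F}_\infty(\tfrac{1}{2}\Delta P) = \mathfrak{F}_\infty$. From

(13.1) $(v', Pv') + (v', [P + \Delta P + cR]v') = 2(v', [P + \tfrac{1}{2}\Delta P + \tfrac{1}{2}cR]v')$

we see that the difference operator $\tfrac{1}{2}\Delta P$ is admissible and that $\mathfrak{G}_\infty \supset \mathfrak{G}_\infty(\tfrac{1}{2}\Delta P)$. In view of Lemma 13.1 we then obtain

$\mathfrak{F}_\infty \supset \mathfrak{F}_\infty(\tfrac{1}{2}\Delta P)$.

But since $ED$ in $\mathfrak{F}_\infty(\tfrac{1}{2}\Delta P)$ is self-adjoint it has no different self-adjoint extension. Therefore

$\mathfrak{F}_\infty = \mathfrak{F}_\infty(\tfrac{1}{2}\Delta P)$.

In the same way we deduce from (13.1)

$\mathfrak{F}_\infty(\Delta P) = \mathfrak{F}_\infty(\tfrac{1}{2}\Delta P)$.

Thus Theorem 13.1 is proved.

$^{27}$ Obviously they do not depend on $c$.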




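14. Example. Let the domain $S$ consist of all points $x$ except $x = 0$; we take

$r(x) = 1$

and define the form $P$ by

$Pv'v' = a(v_1^2 + \cdots + v_n^2) + 2b\left(\frac{x_1}{|x|}v_1 + \cdots + \frac{x_n}{|x|}v_n\right)v_0 + cv_0^2$,

where $a$, $b$, $c$ are constants such that $a > 0$, $c > 0$, $ac - b^2 > 0$. Since the form $P$ is positive-definite we may apply the theory of § 7 to § 12 and define spaces $\mathfrak{G}$ and $\mathfrak{F}$ in which the operator $ED$ is self-adjoint.$^{28}$ This operator can be written as

$EDu = -a(D_1^2 + \cdots + D_n^2)u(x) - (n - 1)\frac{b}{|x|}u(x) + cu(x)$.

On setting $n = 3$, $a = \frac{h^2}{8\pi^2 m}$, $b = \frac{e^2}{2}$, $c > \frac{2\pi^2 me^4}{h^2}$, the operator $ED - c$ is Schrödinger's energy operator of the hydrogen atom. We, therefore, know that this operator is self-adjoint in the space $\mathfrak{F}$.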

Usually, Schrödinger's operator, or generally the above operator $ED$, is connected with the quadratic form

$P_0v'v' = a(v_1^2 + \cdots + v_n^2) - (n - 1)\frac{b}{|x|}v_0^2 + cv_0^2$,

but this form is everywhere positive-definite only if $b \le 0$; then the difference form

$(P - P_0)v'v' = 2b\left(\frac{x_1}{|x|}v_1 + \cdots + \frac{x_n}{|x|}v_n\right)v_0 + (n - 1)\frac{b}{|x|}v_0^2$

is admissible in our sense. On the other hand, in case $b > 0$ it is preferable to work with the form $P$ instead of $P_0$.
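The coefficient $(n - 1)/|x|$ appearing in this example is the divergence of the unit radial field, $D_1(x_1/|x|) + \cdots + D_n(x_n/|x|) = (n - 1)/|x|$. The following sketch checks this identity numerically by central differences; it assumes that the mixed terms of the form carry the factors $x_\nu/|x|$, and the function names and the step size are illustrative choices, not taken from the paper.

```python
import math


def unit_radial_field(x):
    """Return the vector x/|x| for a point x different from 0."""
    norm = math.sqrt(sum(c * c for c in x))
    return [c / norm for c in x]


def divergence(field, x, step=1e-5):
    """Central-difference approximation of D_1 f_1 + ... + D_n f_n at x."""
    total = 0.0
    for nu in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[nu] += step
        backward[nu] -= step
        total += (field(forward)[nu] - field(backward)[nu]) / (2 * step)
    return total


point = [1.0, 2.0, 2.0]                       # |x| = 3 in dimension n = 3
n = len(point)
radius = math.sqrt(sum(c * c for c in point))
print(divergence(unit_radial_field, point), (n - 1) / radius)
```

The two printed numbers should agree up to the discretization error of the difference quotient.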


15. Continuity. In this paragraph we prove the continuity test mentioned in the introduction. To give it a precise form we introduce the spaces $\mathfrak{R}_r$ and operators $\nabla_r$, which are applicable in $\mathfrak{R}_r$ and produce functions in $\mathfrak{H}$ if $r$ is even, in $\mathfrak{H}'$ if $r$ is odd. We define these spaces and operators by recursion.


$^{28}$ In case $n > 2$ we have $\mathfrak{G}_0 = \mathfrak{G}_\infty$; hence $\mathfrak{G}'_0 = \mathfrak{G}'_\infty$ and $\mathfrak{F}_0 = \mathfrak{F}_\infty$; thus both boundary conditions lead to the same space $\mathfrak{F}$. For the proof of this fact one may use the same reasoning as for Theorem 2.3 of my paper in the Mathematische Annalen Bd.


112, p. 8 (1935) (on setting y = ( dia| ). 


n-1 
e 




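1. $\mathfrak{R}_1 = \mathfrak{R}$; $\nabla_1 = D$.

2. $\mathfrak{R}_{2p}$ consists of all $u(x)$ in $\mathfrak{R}_{2p-1}$ for which $\nabla_{2p-1}u$ is in $\mathfrak{R}'$; $\nabla_{2p} = D^*\nabla_{2p-1}$.

3. $\mathfrak{R}_{2p+1}$ consists of all $u(x)$ in $\mathfrak{R}_{2p}$ for which $\nabla_{2p}u$ is in $\mathfrak{R}$; $\nabla_{2p+1} = D\nabla_{2p}$.

We notice that the spaces $\mathfrak{R}_r$ actually do not depend on the number $\beta$ which appeared in the definition of $D$ and $D^*$ (cf. § 4).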


Introducing the number

$m = \left[\frac{n}{2}\right] + 1$,

where $n$ is the dimension of the $x$-space, we have

THEOREM 15.1. If $r \ge m$ the functions $u(x)$ in $\mathfrak{R}_r$ are continuous and have continuous derivatives up to the order $r - m$.$^{29}$

Since the statement does not depend on the number $\beta$ we may choose $\beta = 0$; i.e., $Du = \{0, D_1u, \dots, D_nu\}$, $D^*u' = -D_1u_1 - \cdots - D_nu_n$.

We denote by $\nabla_r^*$ the formal adjoint of $\nabla_r$; i.e., $\nabla_r^* = D^*D \cdots D^*D$ for $r$ even and $\nabla_r^* = D^* \cdots DD^*$ for $r$ odd.

We introduce for even positive $r$ the function

$G^r(x) = |x|^{r-n}(\gamma_r + \beta_r \log |x|)$

and for odd positive $r$ the system $G^r = \{0, G_1^r, \dots, G_n^r\}$ where

$G_\nu^r(x) = x_\nu |x|^{r-n-1}(\gamma_r + \beta_r \log |x|)$,

defined for $x \neq 0$. The coefficients $\gamma_r$ and $\beta_r$ may be chosen such that the relations

$DG^r = -G^{r-1}$ for $r$ even,
$D^*G^r = -G^{r-1}$ for $r \ge 3$ odd,
$D^*G^1 = 0$

hold. $\gamma_r$ and $\beta_r$ are uniquely determined by these relations, by the initial value $\gamma_1 = \omega_n^{-1}$, where $\omega_n$ is the area of the surface of the $n$-dimensional unit sphere, and by the condition $\gamma_n = 0$ for $n$ even.

We notice that $\nabla_r^*G^r = 0$; thus $G^r$ is the fundamental solution of the differential operator $\nabla_r^*$.

$^{29}$ Cf. 9, 10.

We introduce a function $\eta_R(t)$, which has derivatives of every order, such that $\eta_R(t) = 1$ for $|t| \le R$, $0 \le \eta_R(t) \le 1$ as $R \le |t| \le 2R$, $\eta_R(t) = 0$ as $|t| \ge 2R$. Let $S' \ll S$ and $R$ be such that $S'_{2R} \subset S$ (cf. § 2). For $x_0$ in $S'$ we set

$k^r(x) = K^r(x_0, x) = \eta_R(x - x_0)G^r(x - x_0)$.

This function vanishes outside of $S'_{2R}$ and

$\nabla_r^*k^r(x) = \nabla_r^*K^r(x_0, x) = 0$

there and in the neighborhood of $x = x_0$.
Let $v(x)$ be a function, defined in $S$, which has derivatives of every order. Then we can represent this function by means of the fundamental solution:

(15.1) $v(x_0) = -\int_{S'_{2R}} K^r(x_0, x)\nabla_r v(x)\,dx + \int_{S'_{2R}} \nabla_r^*K^r(x_0, x)\,v(x)\,dx$.


We now use the operator $J_a$ (cf. § 2) where the kernel function $j_a(x)$ has derivatives of every order. As an immediate consequence of Lemma 5.1 and 6.1 we have

$\nabla_r J_a u = J_a \nabla_r u$

and, in view of 2.2,

$|\nabla_r J_a u - \nabla_r u|_{S'} \to 0$

for every $u(x)$ in $\mathfrak{R}_r$.
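We first prove that $u(x)$ in $\mathfrak{R}_r$ is continuous if $r \ge m$. We take $r = m$. The functions $k^m(x)$ can be considered as functions in $\mathfrak{H}$ as $m$ is even, in $\mathfrak{H}'$ for $m$ odd; for

$\int |k^m(x)|^2\,dx \le \omega_n \int_0^{2R} \rho^{2m-n-1}\,|\gamma_m + \beta_m \log \rho|^2\,d\rho < \infty$.

Let $u(x)$ be a function in $\mathfrak{R}_m$. We set $v = J_a u$ in (15.1); clearly the right-hand side converges uniformly for $x_0$ in $S'$ to a continuous function $u^0(x_0)$. We have $|J_a u - u^0|_{S'} \to 0$ as well as $|J_a u - u|_{S'} \to 0$ (cf. 2.2); thus $|u - u^0|_{S'} = 0$, $u^0(x) = u(x)$ in $S'$ and, therefore, $u(x)$ is continuous.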


In order to prove the statement concerning derivatives we introduce the operators

$D^{r-m} = D_{\nu_1}D_{\nu_2} \cdots D_{\nu_{r-m}}$.

We may apply $D^{r-m}$ on both sides of (15.1) and obtain

(15.2) $D^{r-m}v(x_0) = -\int_{S'_{2R}} D^{r-m}_{(0)}K^r(x_0, x)\nabla_r v(x)\,dx + \int_{S'_{2R}} D^{r-m}_{(0)}\nabla_r^*K^r(x_0, x)\,v(x)\,dx$,

where the subscript (0) indicates that the operator acts with respect to $x_0$. The function $k(x) = D^{r-m}_{(0)}K^r(x_0, x)$ is in $\mathfrak{H}$ as $r$ is even, in $\mathfrak{H}'$ as $r$ is odd, since

$|k(x)|^2 \le \mathrm{const}\,|x - x_0|^{2m-2n}\,|\gamma_r + \beta_r \log |x - x_0||^2$

in the neighborhood of $x = x_0$.

Therefore we may apply the same reasoning as before and find that $D^{r-m}J_a u(x)$ tends to a continuous function $u^{r-m}(x)$ uniformly in $S'$. On assuming that our statement is true for $r - 1$ we know that $D^{r-m-1}J_a u(x)$ tends to $D^{r-m-1}u(x)$ uniformly in $S'$. Writing $D^{r-m} = D_\nu D^{r-m-1}$ we see that $D^{r-m-1}u(x)$ has the continuous derivative $D_\nu D^{r-m-1}u(x) = u^{r-m}(x)$. Thus Theorem 15.1 is proved.
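The bookkeeping of Theorem 15.1 can be made concrete with a short computation. The sketch below is only an illustration; the function names are hypothetical. It evaluates $m = [n/2] + 1$ and the order $r - m$ up to which continuous derivatives are guaranteed, returning `None` when $r < m$ and the theorem makes no statement.

```python
from typing import Optional


def continuity_index(n: int) -> int:
    """Return m = [n/2] + 1 for an x-space of dimension n."""
    if n < 1:
        raise ValueError("the dimension n must be a positive integer")
    return n // 2 + 1


def guaranteed_derivative_order(r: int, n: int) -> Optional[int]:
    """Order r - m of continuous derivatives for functions in R_r, if r >= m."""
    m = continuity_index(n)
    return r - m if r >= m else None


for n in (1, 2, 3, 4):
    print(n, continuity_index(n), guaranteed_derivative_order(4, n))
```

For instance, in three dimensions $m = 2$, so membership in $\mathfrak{R}_2$ already gives continuity, while in four dimensions $\mathfrak{R}_3$ is needed.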

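We introduce a space $\mathfrak{R}^s$ and $n^s$ operators $D^s$ in $\mathfrak{R}^s$ by the recursion: $\mathfrak{R}^1 = \mathfrak{R}$ and $D^1 = D_\nu$, $\nu = 1, \dots, n$; $\mathfrak{R}^s$ consists of all $u(x)$ in $\mathfrak{R}^{s-1}$ for which each $D^{s-1}u(x)$ is in $\mathfrak{R}$; $D^s u(x) = D_\nu D^{s-1}u(x)$, $\nu = 1, \dots, n$. Then we state

THEOREM 15.2. Let $u(x)$ be in $\mathfrak{R}_{2p}$, $\nabla_{2p}u(x)$ be in $\mathfrak{R}^s$. Then $u(x)$ is in $\mathfrak{R}^{s+1}$, each $D^s u(x)$ is in $\mathfrak{R}_{2p}$ and

$D^s\nabla_{2p}u = \nabla_{2p}D^s u = \nabla_{2p-2}D^*DD^s u$.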


It is sufficient to prove the statement for $s = 1$; for then the statement follows by induction for $s > 1$.

From (15.1) we get by differentiation and partial integration

$D_\nu v(x_0) = -\int_{S'_{2R}} K^{2p}(x_0, x)D_\nu\nabla_{2p}v(x)\,dx - \int_{S'_{2R}} D_\nu\nabla_{2p}^*K^{2p}(x_0, x)\,v(x)\,dx$,

observing that the integrands vanish outside of $S'_{2R}$.


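From this we obtain the inequality

$|D_\nu v|_{S'} \le \int_{|x|<2R} |K^{2p}(x)|\,dx\;|D_\nu\nabla_{2p}v|_{S'_{2R}} + \int_{|x|<2R} |D_\nu\nabla_{2p}^*K^{2p}(x)|\,dx\;|v|_{S'_{2R}}$

by the same reasoning that led to (2.1), taking $K^{2p}(x)$ and $D_\nu\nabla_{2p}^*K^{2p}(x)$ instead of $j_a(x)$. We now set $v(x) = J_a u(x)$; we have $|J_a u - u|_{S'_{2R}} \to 0$ and, since $D\nabla_{2p}J_a u(x) = J_a D\nabla_{2p}u(x)$, also $|D\nabla_{2p}J_a u - D\nabla_{2p}u|_{S'_{2R}} \to 0$. Hence $|DJ_a u - DJ_b u|_{S'} \to 0$ as $a, b \to 0$. Therefore a system $u'(x)$ in $\mathfrak{H}'$ exists such that $|DJ_a u - u'|_{S'} \to 0$. This means, according to 5.3, that $u(x)$ is in $\mathfrak{R}$ and $Du = u'$.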

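We apply the same reasoning to $\nabla_r D_\nu v(x_0)$, taking $(-1)^r\nabla_r K^{2p}(x_0, x)$ instead of $K^{2p}(x_0, x)$; we obtain that $D_\nu u$ is in $\mathfrak{R}_r$, $r = 1, \dots, 2p$, and we deduce $\nabla_{2p}D_\nu u = D_\nu\nabla_{2p}u$ from $\nabla_{2p}D_\nu J_a u = D_\nu\nabla_{2p}J_a u$. Thus Theorem 15.2 is proved.

An immediate consequence of Theorem 15.2 is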


THEOREM 15.3. Let $q(x)$ and $f(x)$ be functions defined in $S$ where $q(x)$ has continuous derivatives up to the order $s$ while $f(x)$ is in $\mathfrak{R}^s$. If $u$ in $\mathfrak{R}_2$ is a solution of

$D^*Du + qu = f$

then $u$ is in $\mathfrak{R}^s$; if $s \ge m$ then $u$ is continuous and has continuous derivatives up to the order $s - m$.


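First we observe that $qu$ is in $\mathfrak{R}^\sigma$ if $u$ is in $\mathfrak{R}^\sigma$, $\sigma \le s$. For $\sigma = 1$, this is an immediate consequence of 5.2 applied on $qu$ and $qDu + Dq \cdot u$; for $\sigma > 1$ we may apply induction.

Since $u$ is in $\mathfrak{R}^1$, also $D^*Du = f - qu$ is in $\mathfrak{R}^1$; hence we learn from Theorem 15.2 that $u$ is in $\mathfrak{R}^2$; therefore $D^*Du = f - qu$ is in $\mathfrak{R}^2$, too, and $u$ is in $\mathfrak{R}^3$.

Repeating this reasoning we find that $u$ is in $\mathfrak{R}^s$. Since, obviously, $\mathfrak{R}_s$ contains $\mathfrak{R}^s$, we may apply Theorem 15.1 in case $s \ge m$.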

In view of the remarks made in the introduction this theorem implies that the characteristic functions of the operator $D^*D + q$ have continuous derivatives up to the order $s - m$ if $q$ has continuous derivatives up to the order $s$.


NEW YORK UNIVERSITY,
NEW YORK, N. Y.




